Cache for CompositeBuilder<T>
Hi,
While investigating high memory allocations in a production system, I noticed that reading PostgreSQL composite arrays (e.g. embedded_entity[]) via Npgsql leads to significant memory usage caused by multiple instances of:
In a memory snapshot (~7GB total allocations), around 600MB were attributed to CompositeBuilder instances alone, exceeding the memory used by the actual materialized objects.
Observations:
- The column type is a PostgreSQL composite array (e.g. public.embedded_entity[]).
- Reading is done via Npgsql (Dapper on top, but Dapper is not involved in composite parsing).
- Each read operation appears to create new CompositeBuilder instances.
- There is a TODO https://github.com/npgsql/npgsql/blob/main/src/Npgsql/Internal/Converters/CompositeConverter.cs#L67 mentioning a potential AsyncLocal cache for this.
It seems CompositeBuilder is not cached per type, and repeated reads cause redundant allocations of builder metadata.
To help reproduce the issue, I’ve attached a small repro script and a PerfView screenshot.
using Npgsql;
using Testcontainers.PostgreSql;
namespace CompositeBuilderExample;
public class Program
{
public class ParentEntity
{
public int Id { get; set; }
public string Name { get; set; } = string.Empty;
public EmbeddedEntity[] EmbeddedEntities { get; set; } = Array.Empty<EmbeddedEntity>();
}
public class EmbeddedEntity
{
public long Field1 { get; set; }
public long Field2 { get; set; }
public long Field3 { get; set; }
}
public static async Task Main(string[] args)
{
var postgresContainer = new PostgreSqlBuilder()
.WithImage("postgres:16")
.WithDatabase("testdb")
.WithUsername("postgres")
.WithPassword("postgres")
.WithReuse(true)
.Build();
await postgresContainer.StartAsync();
try
{
var connectionString = postgresContainer.GetConnectionString();
await using var setupConnection = new NpgsqlConnection(connectionString);
await setupConnection.OpenAsync();
await setupConnection.ExecuteAsync(@"DROP TABLE IF EXISTS ""ParentEntities"" CASCADE;");
await setupConnection.ExecuteAsync(@"DROP TYPE IF EXISTS parent_entity CASCADE;");
await setupConnection.ExecuteAsync(@"DROP TYPE IF EXISTS embedded_entity CASCADE;");
await setupConnection.ExecuteAsync(@"
CREATE TYPE embedded_entity AS (
Field1 bigint,
Field2 bigint,
Field3 bigint
);
");
await setupConnection.ExecuteAsync(@"
CREATE TYPE parent_entity AS (
Id integer,
Name text,
EmbeddedEntities embedded_entity[]
);
");
// table with composite array column
await setupConnection.ExecuteAsync(@"
CREATE TABLE ""ParentEntities"" (
""Id"" SERIAL PRIMARY KEY,
""Name"" VARCHAR(255) NOT NULL,
""EmbeddedEntities"" embedded_entity[] NOT NULL
);
");
await setupConnection.ExecuteAsync(@"
INSERT INTO ""ParentEntities"" (""Name"", ""EmbeddedEntities"")
SELECT
'Parent ' || generate_series,
ARRAY[
ROW(100::bigint, 200::bigint, 300::bigint)::embedded_entity,
ROW(400::bigint, 500::bigint, 600::bigint)::embedded_entity
]::embedded_entity[]
FROM generate_series(1, 100000);
");
await using var dataSource = new NpgsqlDataSourceBuilder(connectionString)
.MapComposite<EmbeddedEntity>("embedded_entity")
.MapComposite<ParentEntity>("parent_entity")
.Build();
await using var readConnection = await dataSource.OpenConnectionAsync();
var allEntities = await readConnection.QueryAsync<ParentEntity>(@"
SELECT * FROM ""ParentEntities""
");
var data = allEntities.ToList();
}
finally
{
// await postgresContainer.StopAsync();
}
}
}
Questions:
- Is the lack of caching for CompositeBuilder intentional?
- Would it be acceptable to introduce per-type caching (e.g. static or thread-static) for builders?
- Are there known concerns around thread safety or memory retention that prevented caching?
- As a workaround I wrote custom converter - was it the only way for me to avoid allocations of CompositeBuilders?
Thanks for your work on Npgsql!