◐ Shell
clean mode source ↗

Cache for CompositeBuilder<T>

Hi,

While investigating high memory allocations in a production system, I noticed that reading PostgreSQL composite arrays (e.g. embedded_entity[]) via Npgsql leads to significant memory usage caused by multiple instances of:

In a memory snapshot (~7GB total allocations), around 600MB were attributed to CompositeBuilder instances alone, exceeding the memory used by the actual materialized objects.

Observations:

It seems CompositeBuilder is not cached per type, and repeated reads cause redundant allocations of builder metadata.
To help reproduce the issue, I’ve attached a small repro script and a PerfView screenshot.

using Npgsql;
using Testcontainers.PostgreSql;

namespace CompositeBuilderExample;

public class Program
{

    public class ParentEntity
    {
        public int Id { get; set; }
        public string Name { get; set; } = string.Empty;
        public EmbeddedEntity[] EmbeddedEntities { get; set; } = Array.Empty<EmbeddedEntity>();
    }

    public class EmbeddedEntity
    {
        public long Field1 { get; set; }
        public long Field2 { get; set; }
        public long Field3 { get; set; }
    }

    public static async Task Main(string[] args)
    {
        var postgresContainer = new PostgreSqlBuilder()
            .WithImage("postgres:16")
            .WithDatabase("testdb")
            .WithUsername("postgres")
            .WithPassword("postgres")
            .WithReuse(true)
            .Build();

        await postgresContainer.StartAsync();

        try
        {
            var connectionString = postgresContainer.GetConnectionString();

            await using var setupConnection = new NpgsqlConnection(connectionString);
            await setupConnection.OpenAsync();

            await setupConnection.ExecuteAsync(@"DROP TABLE IF EXISTS ""ParentEntities"" CASCADE;");
            await setupConnection.ExecuteAsync(@"DROP TYPE IF EXISTS parent_entity CASCADE;");
            await setupConnection.ExecuteAsync(@"DROP TYPE IF EXISTS embedded_entity CASCADE;");

            await setupConnection.ExecuteAsync(@"
                CREATE TYPE embedded_entity AS (
                    Field1 bigint,
                    Field2 bigint,
                    Field3 bigint
                );
            ");

            await setupConnection.ExecuteAsync(@"
                CREATE TYPE parent_entity AS (
                    Id integer,
                    Name text,
                    EmbeddedEntities embedded_entity[]
                );
            ");

            // table with composite array column
            await setupConnection.ExecuteAsync(@"
                CREATE TABLE ""ParentEntities"" (
                    ""Id"" SERIAL PRIMARY KEY,
                    ""Name"" VARCHAR(255) NOT NULL,
                    ""EmbeddedEntities"" embedded_entity[] NOT NULL
                );
            ");

            await setupConnection.ExecuteAsync(@"
                INSERT INTO ""ParentEntities"" (""Name"", ""EmbeddedEntities"")
                SELECT 
                    'Parent ' || generate_series,
                    ARRAY[
                        ROW(100::bigint, 200::bigint, 300::bigint)::embedded_entity,
                        ROW(400::bigint, 500::bigint, 600::bigint)::embedded_entity
                    ]::embedded_entity[]
                FROM generate_series(1, 100000);
            ");

            await using var dataSource = new NpgsqlDataSourceBuilder(connectionString)
                .MapComposite<EmbeddedEntity>("embedded_entity")
                .MapComposite<ParentEntity>("parent_entity")
                .Build();
            
            await using var readConnection = await dataSource.OpenConnectionAsync();
            var allEntities = await readConnection.QueryAsync<ParentEntity>(@"
                SELECT * FROM ""ParentEntities""
            ");

            var data = allEntities.ToList();
           
        }
        finally
        {
            // await postgresContainer.StopAsync();
        }
    }
}

Questions:

  1. Is the lack of caching for CompositeBuilder intentional?
  2. Would it be acceptable to introduce per-type caching (e.g. static or thread-static) for builders?
  3. Are there known concerns around thread safety or memory retention that prevented caching?
  4. As a workaround I wrote custom converter - was it the only way for me to avoid allocations of CompositeBuilders?

Thanks for your work on Npgsql!

Image