Roadmap — CSharpDB

Planned direction for CSharpDB — organized by timeframe and priority. Reflects the current v3.8.0 state.

Need the full source guide? The original long-form markdown version is preserved as Roadmap Source Reference.

Near-Term (Completed)

Recently completed improvements to query performance, storage behavior, provider/tooling compatibility, maintenance workflows, and developer ergonomics.

No-reflection, trim-safe typed collection API via CSharpDB.Generators with GetGeneratedCollectionAsync<T>(), GeneratedCollection<T>, generated field metadata, binary direct payloads for supported shapes, and NativeAOT-friendly model registration.

Separated collection write probes from the read-side B-tree routing-cache, reused traversal scratch during insert/replace, and buffered catalog mutation bookkeeping inside explicit transactions.

Recovered covered composite-index lookup optimization for queries that can be answered entirely from the index without touching the base table.

Configurable durable commit batch window to coalesce WAL fsync calls across concurrent transactions for higher write throughput.

Deduplicate SELECT output with DISTINCT.

Multi-column indexes for broader query coverage.

Use indexes for <, >, <=, >=, BETWEEN — not just equality lookups.

Cache parsed ASTs and query plans to avoid re-parsing identical SQL statements.

Open a database fully in memory, load from disk, and save committed snapshots back to disk.

Nested scalar, array-element, nested array-object, Guid, temporal, and ordered text path indexes.

Merge underflowed pages on delete to reclaim space via borrow/merge with interior collapse.

Maintenance report, REINDEX, VACUUM/compact, fragmentation analysis, and database size report.

CSharpDB.Daemon host with full gRPC coverage for SQL, schema, procedures, collections, and maintenance.

Incremental/sliced auto-checkpointing to move work off the triggering commit path.

Lazy-resident durable storage with on-demand page loading and a gRPC-tunable file cache.

ANALYZE command with persisted row counts, column NDV/min/max, and initial stats-guided index selection.

BackupAsync / RestoreAsync as first-class operations across direct, HTTP, gRPC, CLI, and Admin.

Native .csdbtable snapshots with fast Admin Import / Export, download or server-path destinations, CREATE EXTERNAL TABLE, sys.external_tables, read-only scans/joins, and embedded primary-key lookup indexes.

Validate/apply maintenance workflow that rewrites existing child tables with persisted FK metadata across direct, HTTP, gRPC, CLI, and Admin.

Visual banded-report designer with grouping, sorting, expressions, aggregate functions, page settings, and printable preview.
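
The durable commit batch window listed above coalesces WAL fsync calls across concurrent transactions. The following is a minimal Python model of that general technique (often called group commit), not CSharpDB's implementation: the names `FlushBatch` and `plan_flushes` are hypothetical, and the sketch assumes commit arrival times are sorted and that the first commit of a batch opens the window.

```python
from dataclasses import dataclass, field


@dataclass
class FlushBatch:
    """One physical flush shared by every commit in the batch (hypothetical type)."""
    flush_time_ms: float
    commit_ids: list = field(default_factory=list)


def plan_flushes(arrivals_ms, window_ms):
    """Group sorted commit arrival times into batches that share one flush.

    The first commit of a batch opens a window of `window_ms`. Every commit
    arriving before the window closes joins that batch, and the whole batch
    is flushed once when the window closes.
    """
    batches = []
    current = None
    for commit_id, arrival in enumerate(arrivals_ms):
        # Start a new batch if none is open or the open window has closed.
        if current is None or arrival > current.flush_time_ms:
            current = FlushBatch(flush_time_ms=arrival + window_ms)
            batches.append(current)
        current.commit_ids.append(commit_id)
    return batches


arrivals = [0.0, 0.2, 0.4, 3.0, 3.1, 9.0]
batches = plan_flushes(arrivals, window_ms=1.0)
for batch in batches:
    print(batch.flush_time_ms, batch.commit_ids)
```

In this model the six commits share three flushes with a 1 ms window, while a window of zero gives each commit its own flush. The cost is that the earliest commit in each batch waits up to the full window before it is durable.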

Mid-Term (In Progress)

SQL feature parity, provider/tooling compatibility, and ecosystem expansion.

Done for the trusted in-process model: host-registered C# scalar functions, common SQL/Admin built-ins, trusted commands, Admin Forms/Reports/pipeline hooks, declarative form action sequences, and local Admin Forms C# code modules. Untrusted sandboxed UDF execution is intentionally out of scope.

Opt-in writable external table registrations over mutable .csdbx files, backed by CSharpDB B+tree storage and limited to INSERT, UPDATE, and DELETE in v1 while .csdbtable archives remain read-only.

ROW_NUMBER(), RANK(), DENSE_RANK(), LEAD(), LAG() for analytical queries.

Default expressions in column definitions and arbitrary expression-based constraints per column or table.

v1 support for single-column, column-level REFERENCES with optional ON DELETE CASCADE, plus metadata/tooling surfaces.

CSharpDB.Daemon now hosts the existing REST/HTTP /api surface and gRPC from one long-running process backed by the same warm daemon-hosted client. Standalone CSharpDB.Api remains supported for REST-only hosting.

Opt-in API-key mode protects REST /api/* and daemon gRPC calls with constant-time key comparison while keeping default no-auth behavior for compatibility.

Authorization, protected admin endpoint scopes, JWT/RBAC options, and TLS/mTLS deployment helpers for remote HTTP and gRPC access.

CSharpDB.Daemon can be packaged as a persistent background service across systemd, Windows Service, and launchd.

Self-contained daemon archives and install scripts ship for Windows, Linux, and macOS; dotnet tool, Docker, Homebrew, and winget distribution remain future work.

DbConnection.GetSchema() now exposes standard metadata collections for tooling and ORM schema discovery.

BINARY, NOCASE, NOCASE_AI, and ICU:<locale> collation now work across SQL and collection indexes; dedicated ordered SQL text index optimization remains future work.

Scalar subqueries, IN/EXISTS (including correlated), UNION, INTERSECT, EXCEPT across SELECT results.

Admin query builder with source canvas, join editing, design grid, SQL preview, and saved layouts.
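
The opt-in API-key mode above relies on constant-time key comparison. As an illustration of that idea only, here is a short Python sketch using the standard library's `hmac.compare_digest`. The `X-Api-Key` header name and the `authorize` helper are assumptions made for the example and are not taken from CSharpDB.

```python
import hmac


def api_key_matches(presented: str, expected: str) -> bool:
    """Compare two keys without short-circuiting at the first differing byte."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


def authorize(headers: dict, expected_key) -> bool:
    """Return True if the request may proceed.

    `expected_key=None` models the default no-auth mode kept for compatibility.
    """
    if expected_key is None:
        return True
    # Hypothetical header name; a missing header is treated as an empty key.
    presented = headers.get("X-Api-Key", "")
    return api_key_matches(presented, expected_key)


print(authorize({"X-Api-Key": "s3cret"}, "s3cret"))
print(authorize({"X-Api-Key": "wrong"}, "s3cret"))
```

A plain `==` comparison can return as soon as it finds a mismatch, which lets an attacker infer how much of a guessed key is correct from response timing. A constant-time comparison avoids leaking that information.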

Long-Term (Future)

Advanced features and fundamental architecture enhancements, including long-range items that have since shipped.

Inverted index support with tokenization, stemming, and relevance ranking.

Current phase is complete: opt-in generated models provide GetGeneratedCollectionAsync<T>, generated descriptors/index bindings, binary direct payloads for supported shapes, JSON fallback for unsupported shapes, and trim/NativeAOT smoke coverage.

Streamline NuGet/analyzer packaging, templates, onboarding docs, and project setup for the opt-in generated collection path.

Expand generator support beyond the current scalar, scalar collection, nested scalar, and nested collection-scalar shapes.

Internal row-batch transport serves as the batch-first SQL execution foundation across batch-capable result boundaries, scans, joins, and generic aggregates.

Follow writable .csdbx storage with broader external-table indexes, planner costing, and multi-column lookup/range support beyond the current archive primary-key point-lookup path.

Deep engine/page compression remains planned; application-level payload compression is available as a sample/SDK pattern without changing the storage format.

Encrypt database and WAL files with passphrase-based key management and explicit plaintext/encrypted migration/export paths; implementation must meet the database-encryption plan entry criteria before shipping.

Current phase is complete: ANALYZE-driven stats-guided costing uses internal histograms, heavy hitters, composite-prefix summaries, skew-aware estimates, correlation-aware filters/joins, non-unique lookup costing, hash build-side choice, and bounded DP join reordering.

Current phase is complete: opt-in adaptive join execution can switch eligible index nested-loop joins to hash joins and flip inner hash build sides at safe pre-emission boundaries.

Stable SQL-first diagnostics expose sys.planner_histograms, sys.planner_heavy_hitters, sys.planner_index_prefix_stats, and EXPLAIN ESTIMATE FOR <query>.

Current phase is complete: WAL frame-chunk writes, chunked checkpoint page copies, shared snapshot/export batching, reusable B-tree copy utilities, and the close-out audit cover the main storage and maintenance write paths.

Advisory planner-stat persistence can stay deferred without weakening committed-row durability, and sys.table_stats.row_count_is_exact makes exact versus estimated row-count semantics explicit.

Opt-in UseDurableCommitBatchWindow(...) batches durable WAL flushes across contending in-process transactions — an expert measure-first knob rather than default behavior.

Explicit WriteTransaction conflict-detected retry flow, shared auto-commit non-insert isolation, and opt-in ConcurrentWriteTransactions for shared implicit inserts.

Opt-in concurrent write transactions now reserve shared row-id ranges and rebase hot right-edge insert pages against pending WAL images for improved insert fan-in.

Route API/daemon requests across multiple warm CSharpDB database files so independent tenants or shard keys can use separate WAL and commit paths, with v1 focused on single-shard writes and point reads.

Retained commit-log change feeds and reactive query subscriptions for read replicas, live Admin views, and event-driven applications.
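
The multi-database routing item above sends each tenant or shard key to its own database file so that independent keys use separate WAL and commit paths. One common way to do this is stable hash routing, sketched below in Python under stated assumptions: the file names and the `shard_index` and `database_file_for` helpers are hypothetical, and the roadmap does not say which routing scheme CSharpDB will use.

```python
import hashlib


def shard_index(shard_key: str, shard_count: int) -> int:
    """Map a shard key to a stable index in [0, shard_count)."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


def database_file_for(shard_key: str, database_files: list) -> str:
    """Pick the database file that owns this shard key."""
    return database_files[shard_index(shard_key, len(database_files))]


# Hypothetical file names, for illustration only.
files = ["tenants-0.db", "tenants-1.db", "tenants-2.db", "tenants-3.db"]
for tenant in ["acme", "globex", "initech"]:
    print(tenant, "->", database_file_for(tenant, files))
```

The sketch uses `hashlib` rather than Python's built-in `hash()` because string hashes from `hash()` are randomized per process, so they would not route the same key to the same file across restarts. Note that plain modulo routing remaps most keys when the number of files changes.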

Current Limitations

Known simplifications in the current implementation:

| Area | Limitation |
| --- | --- |
| Functions and automation | CSharpDB's UDF/command model is trusted and in-process by design. Current supported surfaces include host-registered scalar functions, common built-ins, trusted commands, form/report/pipeline hooks, declarative action sequences, and local Admin Forms C# modules; untrusted sandboxed execution is intentionally out of scope |
| Query | Scalar/IN/EXISTS subqueries are supported, including correlated cases in WHERE, non-aggregate projection, and UPDATE/DELETE expressions; correlated subqueries are not yet supported in JOIN ON, GROUP BY, HAVING, ORDER BY, or aggregate projections |
| Query | UNION, INTERSECT, and EXCEPT are supported; UNION ALL is not implemented yet |
| Query | No window functions |
| Schema | No SQL DEFAULT column values or CHECK constraints yet. Foreign keys are currently v1 only: single-column, column-level REFERENCES with optional ON DELETE CASCADE; table-level/composite/deferred foreign keys and ON UPDATE actions are not implemented |
| Indexes | Equality lookups support current INTEGER/TEXT indexes, but ordered range-scan pushdown is still limited to single-column INTEGER index paths |
| RowId | Legacy table schemas without persisted high-water metadata may pay a one-time key scan on first insert |
| Collections | FindByIndexAsync supports declared field-equality lookups; FindByPathAsync and FindByPathRangeAsync support path-based queries on indexed paths; FindAsync remains a full scan for unindexed predicates. Generated collections require registered descriptors for existing collection indexes; unsupported generated model shapes warn and use the source-generated JSON fallback instead of binary direct payloads |
| External Tables | Native .csdbtable archives can be registered and queried as read-only external tables. Writable external tables are planned as an opt-in .csdbx format; current archives remain read-only, and broader external indexes, range seeks, and deeper planner costing remain planned |
| Networking | CSharpDB.Daemon now hosts both REST and gRPC from one process; named pipes remain reserved but are not implemented end to end today |
| Security | Remote REST and daemon gRPC support opt-in API-key authentication, defaulting to None for compatibility. JWT, RBAC, mTLS helpers, TLS-specific configuration, and at-rest encryption are not implemented |
| Admin Forms | The Forms designer/runtime supports the core generated-form and data-entry path plus trusted command-backed automation, including lifecycle events, command buttons, selected-control events, conditional UI rules, domain formula helpers, declarative action sequences, and local C# code modules. It still needs Access-parity work for responsive runtime rendering, complete inferred validation, richer form modes, additional events, advanced filtering/sorting, report/query/import/export actions, macro loops/on-error/temp vars, and broader controls |
| Admin Reports | The Reports designer/runtime supports the core banded preview path plus trusted command-backed preview lifecycle events, but still needs Access-parity work for bounded saved-query previews, full report output/export, parameters, richer grouping and totals semantics, conditional formatting, subreports, and broader controls |
| Text / Multilingual | Text is stored as UTF-8 and supports all Unicode languages; default semantics remain ordinal, but opt-in BINARY, NOCASE, NOCASE_AI, and ICU:<locale> collation are implemented for SQL and collection indexes. Dedicated ordered SQL text index optimization remains planned |
| Concurrency | Physical WAL commit path is still serialized at the storage boundary. Initial multi-writer support is shipped, but observed gains depend on conflict shape and whether shared auto-commit INSERT is left on the default serialized path |
| Storage | No page-level compression; the compression SDK sample stores compressed payloads as ordinary application-managed BLOB values |
| Storage | No at-rest encryption for database/WAL files; on-disk storage is plaintext only |
| Storage | Memory-mapped reads are opt-in and currently apply only to clean main-file pages; WAL-backed reads still rely on the WAL/cache path |
| Storage | By default, durable auto-commit single-row writes still pay a physical WAL flush per commit; opt-in UseDurableCommitBatchWindow(...) can trade some commit latency for higher throughput |
| Query | Phase-2 cost-based planning is in place: ANALYZE, sys.table_stats, sys.column_stats, public planner-stat diagnostics, histogram/heavy-hitter/prefix estimates, and bounded small-chain join reordering now feed join/access-path costing. Opt-in adaptive join re-optimization can react to stale-stat or parameter-sensitive join cardinality misses, while broader runtime actuals, EXPLAIN ANALYZE, and full mid-plan reordering remain future work |
| Query | Internal row-batch transport is now the default scan-heavy execution foundation across batch-capable scans, joins, aggregates, and result boundaries; remaining work is broader kernel specialization and optional SIMD-style tuning rather than missing core batch coverage |

Completed Milestones

Major features already implemented and shipped:

Single-file database with 4 KB page-oriented storage

B+tree-backed tables and secondary indexes

Write-Ahead Log with crash recovery and auto-checkpoint

Concurrent snapshot-isolated readers via WAL-based MVCC

Full SQL pipeline: tokenizer, parser, planner, operators

JOINs (INNER, LEFT, RIGHT, CROSS), aggregates, GROUP BY, HAVING, CTEs

UNION, INTERSECT, EXCEPT set operations

Scalar/IN/EXISTS subqueries (incl. correlated) in filters, projections, and UPDATE/DELETE

Scalar TEXT(expr) for filter-friendly text coercion

Composite (multi-column) indexes

Ordered integer index range scans in the fast lookup path

ANALYZE with persisted table/column stats and stale-aware refresh

Phase-2 cost-based query planning: statistics-guided access paths, join method/reordering, histogram/cardinality estimation

Public planner diagnostics with EXPLAIN ESTIMATE and sys.planner_* catalogs

Opt-in adaptive join re-optimization for eligible stale-stat and parameter-sensitive joins

SELECT DISTINCT and DISTINCT aggregates

SQL statement and SELECT plan caching

First-class IDENTITY / AUTOINCREMENT support for INTEGER PRIMARY KEY columns

Persisted table NextRowId high-water mark with compatibility fallback

Batch-first SQL row-batch execution across scans, joins, aggregates, and result boundaries

Views and triggers (BEFORE/AFTER on INSERT/UPDATE/DELETE)

Foreign key constraints: single-column REFERENCES with optional ON DELETE CASCADE

Older-database foreign-key retrofit migration across direct, HTTP, gRPC, CLI, and Admin

ADO.NET provider with connection pooling and GetSchema metadata collections

In-memory database mode with explicit load/save APIs

Shared/private in-memory ADO.NET connections with named shared-memory hosts

Document Collection API with typed Put/Get/Delete/Scan/Find

Collection secondary field indexes via EnsureIndexAsync / FindByIndexAsync

Binary direct-payload collection storage with direct hydration and field/path extraction

Collection path indexes: nested scalar, array-element, nested array-object, Guid, temporal, ordered text

Collection path query APIs: FindByPathAsync and FindByPathRangeAsync

Source-generated typed collection fast path with trim-safe NativeAOT-friendly access

Full-text search with tokenization, stemming, and relevance ranking

Hybrid storage mode with lazy-resident durable storage and a gRPC-tunable file cache

Client-wide BackupAsync / RestoreAsync across direct, HTTP, gRPC, CLI, and Admin

Native .csdbtable table archives with Admin Import / Export and read-only external table registration

ReplaceAsync for index stores

Maintenance report, REINDEX, and VACUUM flows across client, CLI, API, and Admin UI

Dedicated gRPC daemon host

Remote host consolidation in CSharpDB.Daemon, with REST /api and gRPC sharing one warm hosted database client

Opt-in API-key protection for REST /api/* and daemon gRPC calls

Daemon service packaging with self-contained archives and service install assets

Storage tuning presets, bounded WAL read caching, memory-mapped reads, and sliced background checkpointing

SQL executor/read-path fast paths for compact projections, broader join/index coverage, and correlated subquery filters

REST API with 34+ endpoints and OpenAPI/Scalar documentation

Blazor Server admin dashboard with Forms and Reports designers

Trusted C# callbacks, commands, Admin automation hooks, and local Admin Forms C# code modules

Interactive CLI with meta-commands and file execution

Package-driven ETL pipelines with validation, dry-run, execute/resume, and Admin visual designer

VS Code extension with schema explorer

MCP server for AI assistant integration

NativeAOT C library for cross-language FFI

B+tree delete rebalancing with underflow handling

Reusable snapshot reader sessions for higher concurrent-read throughput

Comprehensive benchmark suite (micro, macro, stress, scaling, in-memory, shared-memory)

Collection write-path performance recovery with separated read/write B-tree routing

Covered composite-index fast-path optimization

Durable-write commit batching for higher concurrent write throughput