feat: Add Vector Search support to MongoDBOnlineStore by caseyclements · Pull Request #6344 · feast-dev/feast
- Extend MongoDBOnlineStoreConfig with VectorStoreConfig (vector_enabled, similarity, vector_index_wait_timeout)
- Auto-create/drop Atlas vector search indexes in update() for feature views with vector_index=True fields
- Implement retrieve_online_documents_v2 using $vectorSearch aggregation
- Add MongoDBAtlasOnlineStoreCreator using MongoDBAtlasLocalContainer from testcontainers-python fork
- Add integration tests for index lifecycle, write+retrieve round-trip, top_k limiting, and teardown cleanup

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
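As a rough illustration of the retrieval step described above, the sketch below builds an aggregation pipeline around a $vectorSearch stage. The stage keys (index, path, queryVector, numCandidates, limit) and the vectorSearchScore metadata are part of MongoDB Atlas Vector Search; the function name, its parameters, and the index and field names are assumptions made for this example and are not taken from the PR.

```python
from typing import Any, Dict, List, Sequence


def build_vector_search_pipeline(
    index_name: str,
    path: str,
    query_vector: Sequence[float],
    top_k: int,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline with a $vectorSearch stage (hypothetical helper)."""
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                # Plain Python floats keep the query vector BSON-encodable.
                "queryVector": [float(x) for x in query_vector],
                # Keep the candidate pool at least as large as the result limit.
                "numCandidates": max(num_candidates, top_k),
                "limit": top_k,
            }
        },
        # Return the stored vector together with its similarity score.
        {"$project": {path: 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


# Hypothetical index and field names, for illustration only.
pipeline = build_vector_search_pipeline(
    "embedding_index", "embedding", [0.1, 0.2, 0.3], top_k=5
)
```

The resulting list could be passed to a PyMongo collection's aggregate() call; the limit value is what enforces top_k.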
…_WAIT

- Explicitly create collection in _ensure_vector_indexes before calling create_search_index (Atlas requires it to exist)
- Restructure tests to share store instance and setup via module-scoped fixture
- Add INDEX_WAIT (default 5s) for Atlas Search eventual consistency after writes

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
- Test write_data now uses np.float32 to match Array(Float32) schema
- retrieve_online_documents_v2 coerces embedding to native Python floats before passing to $vectorSearch, avoiding BSON encoding errors with numpy float types

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
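The coercion mentioned above can be sketched in a few lines. This is a minimal illustration, not the PR's code: the helper name is hypothetical, and decimal.Decimal stands in for a numeric type that is not a native float so the example needs no third-party packages. numpy scalars such as np.float32 also support float(), so the same conversion would apply to them.

```python
from decimal import Decimal
from typing import Iterable, List


def to_native_floats(embedding: Iterable) -> List[float]:
    """Convert each element to a built-in float (hypothetical helper)."""
    # float() accepts any object implementing __float__, including numpy scalars.
    return [float(x) for x in embedding]


# Decimal is only a stand-in for a non-native numeric type in this sketch.
query_vector = to_native_floats([Decimal("0.5"), 1, 2.25])
```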
…tion test

- Add unit tests for retrieve_online_documents_v2 error paths: vector_enabled=False, embedding=None, no vector_index fields, missing vector_length
- Add idempotency integration test: calling update() twice does not duplicate indexes
- Add vector_index_wait_poll_interval config option
- Change _wait_for_index_ready to raise TimeoutError instead of silently continuing on timeout

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
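The change from silently continuing to raising on timeout can be illustrated with a generic polling loop. This is a sketch under assumptions: the function below is not the PR's _wait_for_index_ready, and the is_ready callable is a hypothetical stand-in for whatever check reports that the index is queryable. The timeout and poll_interval arguments play the roles of vector_index_wait_timeout and vector_index_wait_poll_interval.

```python
import time
from typing import Callable


def wait_for_index_ready(
    is_ready: Callable[[], bool],
    timeout: float,
    poll_interval: float,
) -> None:
    """Poll is_ready until it returns True, or raise TimeoutError (illustrative)."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return
        if time.monotonic() >= deadline:
            # Fail loudly so callers never query an index that is not ready.
            raise TimeoutError(f"index not ready after {timeout} seconds")
        time.sleep(poll_interval)
```

Raising here surfaces a slow or failed index build at update time rather than as empty search results later.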
…ntainer

Replace MongoDbContainer with MongoDBAtlasLocalContainer in MongoDB offline store unit tests. This uses the mongodb/mongodb-atlas-local image which includes Atlas Search services, enabling future vector search testing.

- Bump testcontainers from 4.9.0 to 4.15.0rc2
- Switch test fixture to MongoDBAtlasLocalContainer
- Simplify connection string fixture to use get_connection_url()

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
bool is a subclass of int in Python, so isinstance(True, int) returns True. Move the bool check before int so boolean values are correctly converted to ValueProto(bool_val=...) instead of ValueProto(int64_val=...).

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
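The ordering issue is easy to reproduce in isolation. The sketch below does not construct a real ValueProto; it returns the name of the field that would be populated, using a hypothetical helper, so that it runs without Feast installed.

```python
from typing import Any, Tuple


def value_proto_field(value: Any) -> Tuple[str, Any]:
    """Pick the ValueProto field for a Python value (hypothetical helper)."""
    # bool must be tested first because isinstance(True, int) is True.
    if isinstance(value, bool):
        return ("bool_val", value)
    if isinstance(value, int):
        return ("int64_val", value)
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


flag_field = value_proto_field(True)
count_field = value_proto_field(7)
```

With the two isinstance checks swapped, True would be routed to int64_val, which is the bug the commit describes.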
Add docstring to _ensure_vector_indexes explaining the current one-index-per-field design and noting that a single composite index with multiple field definitions would reduce cluster-wide index count and memory overhead.

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
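To make the trade-off concrete, the sketch below contrasts the two shapes of Atlas vector search index definition: one index per vector field, versus one composite index listing several fields. The definition layout (a fields list whose entries have type, path, numDimensions, and similarity) follows the Atlas Vector Search index format; the helper names, the index naming scheme, and the example fields are assumptions for illustration and are not the PR's code.

```python
from typing import Any, Dict


def vector_field(path: str, dims: int, similarity: str) -> Dict[str, Any]:
    """One vector field entry in an Atlas vector search index definition."""
    return {
        "type": "vector",
        "path": path,
        "numDimensions": dims,
        "similarity": similarity,
    }


def per_field_definitions(
    fields: Dict[str, int], similarity: str = "cosine"
) -> Dict[str, Dict[str, Any]]:
    """One index per vector field, keyed by a hypothetical index name."""
    return {
        f"{path}_vector_index": {"fields": [vector_field(path, dims, similarity)]}
        for path, dims in fields.items()
    }


def composite_definition(
    fields: Dict[str, int], similarity: str = "cosine"
) -> Dict[str, Any]:
    """A single index definition covering every vector field."""
    return {
        "fields": [vector_field(path, dims, similarity) for path, dims in fields.items()]
    }


# Hypothetical feature view with two embedding fields.
vector_fields = {"title_embedding": 384, "body_embedding": 768}
separate = per_field_definitions(vector_fields)
combined = composite_definition(vector_fields)
```

Here the per-field layout yields two indexes while the composite layout yields one definition with two field entries.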
Run make lock-python-dependencies-all and pixi lock to regenerate the lock files.

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
The testcontainers upgrade changed the default MySQL dialect, causing 'No module named MySQLdb' errors. Explicitly set dialect='pymysql' in all MySqlContainer instantiations.

Signed-off-by: Casey Clements <casey.clements@mongodb.com>