feat: Updating retrieve online documents v2 to work for other fields for sq… by franciscojavierarceo · Pull Request #5082 · feast-dev/feast
What this PR does / why we need it:
This PR enables full-text search for the retrieve_online_documents/ endpoint for SQLite Vec. It also adds a new parameter to the SDK method, query_string, which can be passed to run a keyword search. The approach has limitations: in particular, the top_k parameter can be misleading, as the example shows. Still, it is a good starting point for keyword search that leverages the existing vector retrieval endpoint. As a next step, enabling hybrid search would be beneficial.
It makes keyword search as simple as:
    results = store.retrieve_online_documents_v2(
        features=[
            "document_embeddings:Embeddings",
            "document_embeddings:content",
            "document_embeddings:title",
        ],
        query=query_embedding,
        query_string="(content: 5) OR (title: 1) OR (title: 3)",
        top_k=3,
    ).to_dict()
    print(results)
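The query_string in this example has the shape of an SQLite FTS5 match expression with column filters (content: 5 restricts the term 5 to the content column). As a rough illustration of how such a string can be ranked with BM25, here is a minimal standalone sketch using Python's sqlite3 module. It assumes the interpreter's SQLite build includes the FTS5 extension; the docs table, the sample rows, and the keyword_search helper are hypothetical and are not taken from Feast's sqlite.py.

```python
import sqlite3

# In-memory database with a hypothetical FTS5 table (assumes FTS5 is compiled in).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, content)")
conn.executemany(
    "INSERT INTO docs (title, content) VALUES (?, ?)",
    [
        ("intro to feast", "feast is a feature store"),
        ("sqlite notes", "sqlite supports full text search"),
        ("vector search", "embeddings enable similarity search"),
    ],
)


def keyword_search(conn, query_string, top_k):
    """Return up to top_k (title, content, score) rows, best BM25 match first.

    FTS5's bm25() yields lower (more negative) values for better matches,
    so ascending order puts the most relevant rows first.
    """
    cur = conn.execute(
        "SELECT title, content, bm25(docs) FROM docs "
        "WHERE docs MATCH ? ORDER BY bm25(docs) LIMIT ?",
        (query_string, top_k),
    )
    return cur.fetchall()


# Column filters combined with OR, mirroring the shape of the PR's query_string.
hits = keyword_search(conn, "(content: search) OR (title: feast)", 3)
# Only one row has "sqlite" in its title, so top_k=3 still returns one row.
few = keyword_search(conn, "title: sqlite", 3)
print(len(hits), len(few))
```

In this sketch top_k is only an upper bound: a query that matches fewer rows than top_k returns just those rows.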
feature_server.py:
- Added optional query_string parameter to the GetOnlineFeaturesRequest class.
- Updated retrieve_online_documents to support the query_string parameter.
feature_store.py:
- Added optional query_string parameter to retrieve_online_documents_v2.
- Updated related methods to handle query_string.
feature_view.py:
- Added an assertion to ensure only one vector feature per feature view.
milvus.py:
- Added optional query_string parameter to retrieve_online_documents_v2.
online_store.py:
- Added optional query_string parameter to retrieve_online_documents_v2.
sqlite.py:
- Extensive changes to support text search with BM25, including adding text_search_enabled configuration and handling query_string.
- Updated SQL operations to support the new functionalities.
passthrough_provider.py and provider.py:
- Updated retrieve_online_documents_v2 to support the query_string.
types.py:
- Added FEAST_VECTOR_TYPES list for handling vector types.
example_feature_repo_1.py:
- Added content and title fields to an example feature view.
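The feature_view.py change above enforces at most one vector feature per feature view. A minimal sketch of what such a check could look like follows. SketchField and assert_single_vector_field are hypothetical stand-ins invented for illustration; they are not Feast's Field class or the assertion actually added in this PR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchField:
    """Hypothetical stand-in for a feature view field (not feast.Field)."""

    name: str
    vector_index: bool = False  # True marks the field as a vector (embedding) feature


def assert_single_vector_field(fields: List[SketchField]) -> None:
    """Raise AssertionError if more than one field is flagged as a vector feature."""
    vector_fields = [f.name for f in fields if f.vector_index]
    assert len(vector_fields) <= 1, (
        f"Only one vector feature is allowed per feature view, got: {vector_fields}"
    )


# One vector field alongside plain text fields passes the check.
assert_single_vector_field(
    [
        SketchField("Embeddings", vector_index=True),
        SketchField("content"),
        SketchField("title"),
    ]
)
```

Running the check when the feature view is defined, rather than at query time, would surface a misconfigured schema before any retrieval call depends on it.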