feat: Updating retrieve online documents v2 to work for other fields for sq… by franciscojavierarceo · Pull Request #5082 · feast-dev/feast

What this PR does / why we need it:

This PR enables full-text search for the retrieve_online_documents/ endpoint with SQLite Vec. It also adds a query_string parameter to the SDK method retrieve_online_documents_v2, which can be passed to run a keyword search. The approach has limitations: in particular, the top_k parameter can be misleading (as the example shows). Still, it is a good starting point for keyword search that reuses the existing vector retrieval endpoint. As a next step, enabling hybrid search would be beneficial.

It makes keyword search as simple as:

results = store.retrieve_online_documents_v2(
    features=[
        "document_embeddings:Embeddings",
        "document_embeddings:content",
        "document_embeddings:title",
    ],
    query=query_embedding,
    query_string="(content: 5) OR (title: 1) OR (title: 3)",
    top_k=3,
).to_dict()
print(results)
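The query_string above resembles SQLite FTS5 column-filter syntax, and the change summary mentions BM25 ranking in sqlite.py. As a rough illustration only, here is a minimal standalone sketch of that kind of query using Python's sqlite3 module. It assumes the local SQLite build includes FTS5; the table name docs and the helpers fts5_available, build_index, and keyword_search are hypothetical and are not taken from the Feast code.

```python
import sqlite3


def fts5_available(conn):
    """Return True if this SQLite build ships the FTS5 extension."""
    try:
        conn.execute("CREATE VIRTUAL TABLE fts5_probe USING fts5(x)")
        conn.execute("DROP TABLE fts5_probe")
        return True
    except sqlite3.OperationalError:
        return False


def build_index(conn, docs):
    """Create a hypothetical FTS5 table and load (doc_id, title, content) rows."""
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5(doc_id UNINDEXED, title, content)"
    )
    conn.executemany(
        "INSERT INTO docs (doc_id, title, content) VALUES (?, ?, ?)", docs
    )


def keyword_search(conn, query_string, top_k):
    """Run an FTS5 MATCH query and return at most top_k rows, best match first."""
    # bm25() yields lower (more negative) values for better matches,
    # so ascending order puts the best match first.
    cursor = conn.execute(
        "SELECT doc_id, title, content, bm25(docs) AS score "
        "FROM docs WHERE docs MATCH ? ORDER BY score LIMIT ?",
        (query_string, top_k),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
matches = []
if fts5_available(conn):
    build_index(
        conn,
        [
            (1, "doc 1", "alpha 9"),
            (2, "doc 2", "beta 5"),
            (3, "doc 3", "gamma 7"),
            (4, "doc 4", "delta 8"),
        ],
    )
    # Same shape as the query_string in the example above.
    matches = keyword_search(
        conn, "(content: 5) OR (title: 1) OR (title: 3)", top_k=3
    )
    for doc_id, title, content, score in matches:
        print(doc_id, title, content, round(score, 3))
else:
    print("FTS5 is not available in this SQLite build")
```

In this sketch top_k only caps the number of rows returned by the keyword match; how the PR combines it with the vector query is not shown here.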
  • feature_server.py:
    • Added optional query_string parameter to the GetOnlineFeaturesRequest class.
    • Updated retrieve_online_documents to support the query_string parameter.
  • feature_store.py:
    • Added optional query_string parameter to retrieve_online_documents_v2.
    • Updated related methods to handle query_string.
  • feature_view.py:
    • Added an assertion to ensure only one vector feature per feature view.
  • milvus.py:
    • Added optional query_string parameter to retrieve_online_documents_v2.
  • online_store.py:
    • Added optional query_string parameter to retrieve_online_documents_v2.
  • sqlite.py:
    • Added BM25 text search, including a text_search_enabled configuration option and query_string handling.
    • Updated SQL operations to support the new functionalities.
  • passthrough_provider.py and provider.py:
    • Updated retrieve_online_documents_v2 to support the query_string.
  • types.py:
    • Added FEAST_VECTOR_TYPES list for handling vector types.
  • example_feature_repo_1.py:
    • Added content and title fields to an example feature view.
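
The feature_view.py entry above mentions an assertion that allows only one vector feature per feature view. A check of that kind could look roughly like the sketch below. SketchField and assert_single_vector_field are hypothetical stand-ins written for illustration; they are not Feast classes, and the actual assertion in the PR may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SketchField:
    """Hypothetical stand-in for a feature view field (not the Feast class)."""

    name: str
    dtype: str
    vector_index: bool = False


def assert_single_vector_field(fields: List[SketchField]) -> List[str]:
    """Assert that at most one field is flagged as a vector feature."""
    vector_fields = [f.name for f in fields if f.vector_index]
    assert len(vector_fields) <= 1, (
        f"Only one vector feature is allowed per feature view, got: {vector_fields}"
    )
    return vector_fields


# Mirrors the fields used in the example: one embedding plus two text fields.
schema = [
    SketchField("Embeddings", "Array(Float32)", vector_index=True),
    SketchField("content", "String"),
    SketchField("title", "String"),
]
print(assert_single_vector_field(schema))
```

Failing early at definition time keeps the retrieval path simple, since a query embedding can then only ever be compared against one vector column.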

Which issue(s) this PR fixes:

#5081
#5073
