feat: Add non-entity retrieval support for ClickHouse offline store by YassinNouh21 · Pull Request #6066 · feast-dev/feast
Enable get_historical_features() to be called with entity_df=None by passing end_date kwarg instead. When entity_df is None, a synthetic single-row DataFrame is created using end_date (defaults to now). The PIT join window is controlled by end_date and TTL. Includes integration test against a real ClickHouse container. Fixes feast-dev#5835 Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
YassinNouh21 added a commit to YassinNouh21/feast that referenced this pull request
…_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
ntkathole pushed a commit to YassinNouh21/feast that referenced this pull request
…_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
franciscojavierarceo added a commit that referenced this pull request
…rieval (#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR #6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> --------- Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com>
Anarion-zuo pushed a commit to Anarion-zuo/feast that referenced this pull request
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> --------- Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com> Signed-off-by: aaronzuo <anarionzuo@outlook.com>
Shizoqua pushed a commit to Shizoqua/feast that referenced this pull request
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> --------- Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com> Signed-off-by: Shizoqua <hr.lanreshittu@gmail.com>
aniketpalu pushed a commit to aniketpalu/feast that referenced this pull request
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> --------- Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com> Signed-off-by: Aniket Paluskar <apaluska@redhat.com>
yuan1j pushed a commit to yuan1j/feast that referenced this pull request
…rieval (feast-dev#6110) * fix(postgres): Use end_date instead of start_date in synthetic entity_df for non-entity retrieval The non-entity retrieval path created a synthetic entity_df using pd.date_range(start=start_date, ...)[:1], which placed start_date as the event_timestamp. Since PIT joins use MAX(entity_timestamp) as the upper bound for feature data filtering, using start_date made end_date unreachable — no features after start_date would be returned. Fix: use [end_date] directly, matching the ClickHouse implementation (PR feast-dev#6066) and the Dask offline store behavior. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * fix: preserve timestamp range for min_event_timestamp and fix formatting The entity_df fix alone would cause min_event_timestamp to be computed as end_date - TTL (instead of start_date - TTL), clipping valid data from the query window. Override entity_df_event_timestamp_range to (start_date, end_date) in non-entity mode so the full range is used. Also fix ruff formatting in the test file. Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> * test: add integration test for non-entity retrieval Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> --------- Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com> Signed-off-by: yuanjun220 <1069645408@qq.com>
franciscojavierarceo pushed a commit that referenced this pull request
# [0.61.0](v0.60.0...v0.61.0) (2026-04-07) ### Bug Fixes * Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a)) * Add https readiness check for rest-registry tests ([ea85e63](ea85e63)) * Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43)) * Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7)) * Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361)) * Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8)) * Fix integration tests ([#6046](#6046)) ([02d5548](02d5548)) * Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce)) * Fix non-specific label selector on metrics service ([a1a160d](a1a160d)) * fix path feature_definitions.py ([7d7df68](7d7df68)) * Fix regstry Rest API tests intermittent failure ([d53a339](d53a339)) * Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148)) * Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7)) * Fixed pre-commit check ([114b7db](114b7db)) * Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7)) * Fixed uv cache permission error for docker build on mac ([ad807be](ad807be)) * Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a)) * Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b)) * Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d)) * Integration test failures ([#6040](#6040)) ([9165870](9165870)) * Mount TLS volumes for init container ([080a9b5](080a9b5)) * **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066) * Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a)) * Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8)) * SSL/TLS mode by default for postgres connection ([4844488](4844488)) * Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7)) ### Features * Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007) * Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf)) * Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf)) * Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43)) * Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728) * Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18)) * Add metadata statistics to registry api ([ef1d4fc](ef1d4fc)) * Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835) * Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462) * Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368)) * Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6)) * Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21)) * Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116) * Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311)) * Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3)) * Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2)) * Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8)) * Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516)) * Added odfv transformations metrics ([8b5a526](8b5a526)) * Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7)) * Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06)) * Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026)) * Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c)), closes [Hi#Availability](https://github.com/Hi/issues/Availability) [Hi#Availability](https://github.com/Hi/issues/Availability) * **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a)) * Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6)) * Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7)) * Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c)) * Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9)) * Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570)) * Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983)) * Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642)) * Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1)) * Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a)) * Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32)) ### Performance Improvements * Online feature response construction in a single pass over read rows ([113fb04](113fb04)) * Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8)) * Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95)) * Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944)) * Remove redundant entity key serialization in online_read ([d87283f](d87283f))