feat: Add possibility to materialize only latest values, to increase performance #5713
…terialization logic (calling it) Signed-off-by: lukas.valatka <lukas.valatka@cast.ai>
…e' of github.com:astronautas/feast into feat/add-selective-deduplicate-pushdown-to-offline-store
Let's re-run tests? Random issue, but no changes to dependency management :/
Seems like the AWS creds have expired @franciscojavierarceo
HaoXuAI left a comment
I think it might be better to add the config to the fs.materialize API? That way you can customize the materialize process per FeatureView: enable the pushdown filter where you need it, and leave it off for other processes where you don't.
Why not indeed. I'll check it out and tag you back.
Merged commit 8d77b72 into feast-dev:master on Nov 11, 2025
…performance (#5713)

* add pull_all_from_table_or_query for clickhouse, to align with new materialization logic (calling it)
Signed-off-by: lukas.valatka <lukas.valatka@cast.ai>
* add option to select to materialize only latest values, for performance
Signed-off-by: lukas.valatka <lukas.valatka@cast.ai>
* enforce non optional params
Signed-off-by: lukas.valatka <lukas.valatka@cast.ai>

---------

Signed-off-by: lukas.valatka <lukas.valatka@cast.ai>
Co-authored-by: Lukas Valatka <lukas@valatka.net>
# [0.57.0](v0.56.0...v0.57.0) (2025-11-13)

### Bug Fixes

* Improve trino to feast type mapping with (real,varchar,timestamp,decimal) ([#5691](#5691)) ([f855ad2](f855ad2))
* Materialize API - ODFV views not looked-up (thinks views non existant) - crashes materialize ([#5716](#5716)) ([1b050b3](1b050b3))
* Support historical feature retrieval with start_date/end_date in RemoteOfflineStore ([#5703](#5703)) ([ad32756](ad32756))
* Thread safe Clickhouse offline store ([#5710](#5710)) ([5f446ed](5f446ed))

### Features

* Add annotations to cronjob CRDs ([#5701](#5701)) ([be6e6c2](be6e6c2))
* Add batch commit mode for MySQL OnlineStore ([#5699](#5699)) ([3cfe4eb](3cfe4eb))
* Add possibility to materialize only latest values, to increase performance ([#5713](#5713)) ([8d77b72](8d77b72))
* Support table format: Iceberg, Delta, and Hudi ([#5650](#5650)) ([2915ad1](2915ad1))

What this PR does / why we need it:
Adds an option to materialize only the latest values (essentially pushing deduplication down to the offline store), which reduces client memory consumption and end-to-end duration. The effect is especially noticeable for large-scale, latency-critical materializations - think hundreds of thousands of rows across ~150 feature views - as we observed in our ML project at cast.ai.
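To illustrate the idea of pushing deduplication down to the store, here is a minimal, self-contained sketch. It is not the Feast implementation: it uses an in-memory SQLite table as a stand-in offline store, and the table name `features`, its columns, and both helper functions are hypothetical names chosen for the example. It compares pulling every row and deduplicating on the client with asking the store for only the newest row per entity.

```python
import sqlite3


def latest_rows_client_side(rows):
    """Client-side dedup: every row is pulled, then the newest per entity is kept."""
    latest = {}
    for entity_id, event_ts, value in rows:
        # ISO-8601 timestamps compare correctly as plain strings.
        if entity_id not in latest or event_ts > latest[entity_id][1]:
            latest[entity_id] = (entity_id, event_ts, value)
    return sorted(latest.values())


def latest_rows_pushed_down(conn):
    """Pushed-down dedup: the store returns only the newest row per entity."""
    query = """
        SELECT f.entity_id, f.event_ts, f.value
        FROM features AS f
        JOIN (
            SELECT entity_id, MAX(event_ts) AS max_ts
            FROM features
            GROUP BY entity_id
        ) AS m
          ON f.entity_id = m.entity_id AND f.event_ts = m.max_ts
        ORDER BY f.entity_id
    """
    return conn.execute(query).fetchall()


# Hypothetical stand-in for an offline store table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (entity_id INTEGER, event_ts TEXT, value REAL)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [
        (1, "2025-11-01T00:00:00", 0.1),
        (1, "2025-11-02T00:00:00", 0.2),
        (2, "2025-11-01T00:00:00", 0.5),
        (2, "2025-11-03T00:00:00", 0.7),
        (2, "2025-11-02T00:00:00", 0.6),
    ],
)

all_rows = conn.execute("SELECT entity_id, event_ts, value FROM features").fetchall()
client_side = latest_rows_client_side(all_rows)
pushed_down = latest_rows_pushed_down(conn)
```

Both paths yield the same latest values, but the pushed-down query transfers one row per entity instead of the full history, which is where the memory and duration savings would come from.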
Which issue(s) this PR fixes:
#5707 (comment)
Misc
This will be configured via the feature store (repo) config file: