feat: Unify Feature Transformations and Feature Views#5747

Open
franciscojavierarceo wants to merge 34 commits into master from refactor-odfv
Conversation

@franciscojavierarceo franciscojavierarceo commented Nov 28, 2025

Member

What this PR does / why we need it:

This PR refactors the transformation system with a cleaner architecture that separates transformation logic from execution. Transformations should focus on HOW to transform data, while FeatureViews handle WHEN and WHERE to execute.

Changes:

  • Added: feature_transformation field to FeatureView for transformation logic
  • Added: transform: bool = True parameter to API methods for per-call control
  • Added: Schema validation for pre-transformed data (transform=False)
  • Added: Transformation execution logic in online/offline feature retrieval

  # Define transformation logic
  @transformation(mode="python")
  def driver_features(inputs):
      return {"conv_plus_acc": [c + a for c, a in zip(inputs["conv_rate"], inputs["acc_rate"])]}

  # Create FeatureView with transformation
  fv = FeatureView(
      name="driver_computed_features",
      source=driver_source,
      entities=[driver],
      schema=[Field(name="conv_plus_acc", dtype=Float64)],
      feature_transformation=driver_features,  # ← NEW
      online=True,
  )

  # API-level transformation control
  # Execute transformations (default)
  features = store.get_online_features(
      features=["driver_computed_features:conv_plus_acc"],
      entity_rows=entity_rows,
      transform=True  # ← NEW (default)
  )

  # Skip transformations for external batch jobs
  features = store.get_online_features(
      features=["driver_computed_features:conv_plus_acc"],
      entity_rows=spark_transformed_rows,
      transform=False  # ← NEW: skip + validate schema
  )
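
To make the `transform=True` / `transform=False` behavior described above concrete, here is a minimal, library-free sketch. It is not Feast's implementation: `retrieve`, `validate_schema`, and the plain-`dict` schema are hypothetical names introduced only for illustration, and the type check is deliberately simplistic.

```python
from typing import Any, Callable, Dict, List, Optional

Columns = Dict[str, List[Any]]


def validate_schema(columns: Columns, schema: Dict[str, type]) -> None:
    """Raise ValueError if a feature is missing or holds a value of the wrong type."""
    missing = sorted(set(schema) - set(columns))
    if missing:
        raise ValueError(f"missing features: {missing}")
    for name, expected in schema.items():
        for value in columns[name]:
            if not isinstance(value, expected):
                raise ValueError(
                    f"feature {name!r}: expected {expected.__name__}, "
                    f"got {type(value).__name__}"
                )


def retrieve(
    columns: Columns,
    schema: Dict[str, type],
    transformation: Optional[Callable[[Columns], Columns]] = None,
    transform: bool = True,
) -> Columns:
    """Hypothetical retrieval step: run the transformation, or validate pre-transformed input."""
    if transform and transformation is not None:
        return transformation(columns)
    # transform=False: the caller claims the data is already transformed,
    # so only check it against the declared schema and pass it through.
    validate_schema(columns, schema)
    return {name: columns[name] for name in schema}


def driver_features(inputs: Columns) -> Columns:
    # Same logic as the PR's example transformation, without the decorator.
    return {"conv_plus_acc": [c + a for c, a in zip(inputs["conv_rate"], inputs["acc_rate"])]}


schema = {"conv_plus_acc": float}
raw = {"conv_rate": [0.5, 0.25], "acc_rate": [0.25, 0.5]}

computed = retrieve(raw, schema, driver_features)  # transform=True (default)
passthrough = retrieve({"conv_plus_acc": [0.75, 0.75]}, schema, transform=False)
```

In this sketch, passing untransformed rows with `transform=False` raises a `ValueError` because `conv_plus_acc` is absent. That failure mode is what the schema validation for pre-transformed data is meant to surface.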

Which issue(s) this PR fixes:

#4584, #5716, #5689

Misc

@franciscojavierarceo franciscojavierarceo requested a review from a team as a code owner November 28, 2025 18:20
@franciscojavierarceo franciscojavierarceo force-pushed the refactor-odfv branch 2 times, most recently from 2899413 to 649c429 Compare December 1, 2025 21:47
HaoXuAI commented Dec 16, 2025

Collaborator

I think 'when' can be a bit confusing. Even for 'batch', it can be 'on_read' or 'on_write'. Looking at it from a different angle: should writing be in the scope of a transformation at all? Isn't a transformation only responsible for transforming the data?

Since the FeatureView is the actual data asset, I think it makes more sense for it to have the API.

@franciscojavierarceo

Member Author

> I think 'when' can be a bit confusing. Even for 'batch', it can be 'on_read' or 'on_write'. Looking at it from a different angle: should writing be in the scope of a transformation at all? Isn't a transformation only responsible for transforming the data?
>
> Since the FeatureView is the actual data asset, I think it makes more sense for it to have the API.

Yeah, that's fair. @HaoXuAI I updated the PR, lmk what you think.

@franciscojavierarceo franciscojavierarceo requested a review from a team as a code owner January 5, 2026 04:17
@franciscojavierarceo franciscojavierarceo requested review from ejscribner, robhowley and shuchu and removed request for a team January 5, 2026 04:17
@franciscojavierarceo franciscojavierarceo force-pushed the refactor-odfv branch 2 times, most recently from 5abbec2 to e5f074e Compare January 5, 2026 15:18
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
… auto-inference

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
franciscojavierarceo and others added 14 commits January 5, 2026 11:42
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
… store (#5807)

* Update redis.py

Add millisecond-precision timestamp support to Redis online store

Signed-off-by: Jatin Kumar <jatink.5251@gmail.com>

* Update redis.py

sub-second precision when returning timestamps to client

Signed-off-by: Jatin Kumar <jatink.5251@gmail.com>

* Update redis.py

fix(redis): preserve millisecond timestamp precision

Signed-off-by: Jatin Kumar <jatink.5251@gmail.com>

* Update redis.py

fix: Remove whitespace on blank lines (W293)

Signed-off-by: Jatin Kumar <jatink.5251@gmail.com>

---------

Signed-off-by: Jatin Kumar <jatink.5251@gmail.com>
Signed-off-by: samuelkim7 <samuel.kim@goflink.com>
* chore: Refactor some unit tests into integration tests

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* chore: Refactor some unit tests into integration tests

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* rename TestConfig

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* rename TestConfig

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* add integration flag

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* update paths

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

* update paths

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Srihari <svenkata@redhat.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

6 participants