feat: Add skip_feature_view_validation parameter to push() for ODFVs with missing UDF dependencies#5894

Draft
Copilot wants to merge 11 commits into master from copilot/fix-module-not-found-error

Conversation

Copilot AI commented Jan 23, 2026

Contributor

What this PR does / why we need it:

store.push() fails with ModuleNotFoundError when On-Demand Feature Views contain UDFs that reference modules unavailable in the current environment (e.g., training code not deployed to serving infrastructure). This blocks data ingestion in production environments.

Changes

Transformation deserialization bypass:

  • PandasTransformation.from_proto() and PythonTransformation.from_proto() accept skip_udf parameter
  • When skip_udf is enabled, returns explicit identity functions instead of deserializing with dill.loads()
  • Preserves type signatures for both DataFrame and Dict transformations
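
The bypass described above can be sketched in isolation. This is a minimal illustration rather than the PR's actual code: it uses the standard-library pickle as a stand-in for dill, and the names load_udf, identity_udf, and the module training_env_only are hypothetical.

```python
import pickle
import sys
import types
from typing import Any, Callable


def identity_udf(inputs: Any) -> Any:
    """Placeholder UDF: hands its input back unchanged."""
    return inputs


def load_udf(body: bytes, skip_udf: bool = False) -> Callable[[Any], Any]:
    """Return the deserialized UDF, or a placeholder when skip_udf is set."""
    if skip_udf:
        # No unpickling happens, so missing modules are never imported.
        return identity_udf
    return pickle.loads(body)


# Simulate the "registering" environment: a module that exists only here.
# (Hypothetical stand-in for the `training` module from the issue.)
training = types.ModuleType("training_env_only")
exec("def add_one(x):\n    return x + 1", training.__dict__)
sys.modules["training_env_only"] = training
body = pickle.dumps(training.add_one)  # stored by module and function name

# Simulate the "serving" environment: the module is no longer importable.
del sys.modules["training_env_only"]
```

In this sketch, load_udf(body) raises ModuleNotFoundError because unpickling tries to import the missing module, while load_udf(body, skip_udf=True) returns the placeholder without touching the serialized bytes.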

Call chain propagation:

  • push(), push_async(), write_to_online_store(), write_to_online_store_async() accept skip_feature_view_validation parameter
  • Flows through list_all_feature_views() → registry → proto_registry_utils
  • OnDemandFeatureView.from_proto() propagates skip_udf to transformation parsing

Selective UDF skipping (Critical safeguard):

  • skip_feature_view_validation only skips UDF deserialization for ODFVs with write_to_online_store=False
  • ODFVs with write_to_online_store=True always load their actual UDFs since they will be executed during push operations
  • This prevents hiding legitimate errors where the UDF is actually needed for transformation execution
  • Logic implemented in list_on_demand_feature_views() to check write_to_online_store flag before applying skip_udf
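
The safeguard reduces to a single boolean decision per ODFV. The sketch below is an assumption about the shape of that check, not the code in list_on_demand_feature_views(); OdfvSpec, resolve_skip_udf, and plan_udf_loading are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class OdfvSpec:
    """Hypothetical stand-in for the fields of an ODFV proto that matter here."""

    name: str
    write_to_online_store: bool


def resolve_skip_udf(spec: OdfvSpec, skip_requested: bool) -> bool:
    """Skip UDF loading only if the caller asked for it AND the view
    does not run its transformation during push."""
    return skip_requested and not spec.write_to_online_store


def plan_udf_loading(
    specs: List[OdfvSpec], skip_requested: bool
) -> List[Tuple[str, bool]]:
    """Return (view name, skip_udf) pairs for every view in the registry."""
    return [(spec.name, resolve_skip_udf(spec, skip_requested)) for spec in specs]
```

With skip_requested=True, a view that has write_to_online_store=True still resolves to skip_udf=False, so a missing dependency for that view surfaces as an error instead of being masked by a placeholder.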

Caching strategy:

  • Modified list_all_feature_views() and list_on_demand_feature_views() to support conditional caching
  • When skip_udf=False (default): Uses cached versions (_list_all_feature_views_cached, _list_on_demand_feature_views_cached) with @registry_proto_cache_with_tags decorator for performance
  • When skip_udf=True: Bypasses caching to prevent cache pollution with dummy UDFs
  • Maintains backward compatibility by preserving caching behavior for default case
  • The @registry_proto_cache_with_tags decorator was moved from the public functions to new internal cached versions because the decorator's signature only supports 3 parameters (registry_proto, project, tags) and cannot accommodate the additional skip_udf parameter

API Consistency:

Usage

# Before: Fails if UDF references unavailable modules
store.push("push_source", df)

# After: Skips UDF deserialization for ODFVs that won't execute transformations
store.push("push_source", df, skip_feature_view_validation=True)

Important: ODFVs with write_to_online_store=True will always have their UDFs deserialized even when skip_feature_view_validation=True, as their transformations are executed during push. Only ODFVs that don't execute transformations during push can safely skip UDF loading.

All parameters default to False for backward compatibility. No API breaking changes.
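
One possible way for a caller to adopt the flag is to keep the default behavior and fall back only when a UDF module is missing. The helper below is a hypothetical sketch, not part of this PR; push_with_fallback is an invented name, and it assumes only that store.push() accepts skip_feature_view_validation as proposed here.

```python
from typing import Any


def push_with_fallback(store: Any, push_source_name: str, df: Any) -> bool:
    """Push with full validation first; retry with UDF loading skipped
    if the environment lacks a module that a UDF depends on.

    Returns True when the fallback path was used.
    """
    try:
        store.push(push_source_name, df)
        return False
    except ModuleNotFoundError:
        # Retry using the parameter proposed in this PR. If an ODFV with
        # write_to_online_store=True needs the missing module, this call
        # raises again, which is the intended safeguard.
        store.push(push_source_name, df, skip_feature_view_validation=True)
        return True
```

The return value lets the caller log or alert when the fallback was needed, so a missing dependency does not go unnoticed.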

Which issue(s) this PR fixes:

Addresses ModuleNotFoundError when pushing data with ODFVs containing UDFs that reference environment-specific modules.

Misc

  • CodeQL scan: 0 alerts
  • Added unit tests for transformation skip logic and parameter propagation
  • Added test test_skip_feature_view_validation_only_applies_to_non_writing_odfvs() to validate that write_to_online_store=True ODFVs always load real UDFs
  • Follows pattern suggested by @franciscojavierarceo for consistency with apply() method
  • Parameter naming (skip_feature_view_validation) matches PR #5859, "feat: Add skip_feature_view_validation parameter to FeatureStore.apply() and plan()", for API consistency across all FeatureStore methods
  • Caching implementation reviewed and confirmed to maintain performance for default use case while preventing cache pollution when skip_feature_view_validation is enabled
  • Follow-up issue recommended for cleaning up async methods (write_to_online_store_async, push_async) as async functionality should be server-specific going forward
Original prompt

This section details the original issue you should resolve

<issue_title>Read On-Demand Feature View and deserialization while pushing data</issue_title>
<issue_description> ## Description

A ModuleNotFoundError occurs when calling store.push() to ingest data into the Online Store. The error is triggered when Feast attempts to synchronize the registry and encounters an On-Demand Feature View.

Because Feast uses dill to serialize/deserialize User Defined Functions (UDFs), it fails if the execution environment lacks the specific Python module (in this case, training) that was present when the UDF was originally defined and registered.
🔍 Error Traceback

File "/usr/local/lib/python3.12/site-packages/data_dataflow/core/io.py", line 201, in process
    self.store.push(push_source_name, df, to=PushMode.ONLINE)  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1698, in push
    self.write_to_online_store(
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1946, in write_to_online_store
    feature_view, df = self._get_feature_view_and_df_for_online_write(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 1904, in _get_feature_view_and_df_for_online_write
    for fv_proto in self.list_all_feature_views(allow_registry_cache)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 294, in list_all_feature_views
    return self._list_all_feature_views(allow_cache, tags=tags)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/feature_store.py", line 269, in _list_all_feature_views
    for fv in self.registry.list_all_feature_views(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/registry.py", line 647, in list_all_feature_views
    return proto_registry_utils.list_all_feature_views(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 67, in wrapper
    cache_value = func(registry_proto, project, tags)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 243, in list_all_feature_views
    + list_on_demand_feature_views(registry_proto, project, tags)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 67, in wrapper
    cache_value = func(registry_proto, project, tags)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/proto_registry_utils.py", line 285, in list_on_demand_feature_views
    OnDemandFeatureView.from_proto(on_demand_feature_view)
  File "/usr/local/lib/python3.12/site-packages/feast/on_demand_feature_view.py", line 400, in from_proto
    transformation = PandasTransformation.from_proto(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/feast/transformation/pandas_transformation.py", line 150, in from_proto
    udf=dill.loads(user_defined_function_proto.body),
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 311, in loads
    return load(file, ignore, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 297, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 452, in load
    obj = StockUnpickler.load(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/dill/_dill.py", line 442, in find_class
    return StockUnpickler.find_class(self, module, name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'training'

Steps to Reproduce

- Define an OnDemandFeatureView in an environment where the transformation code is placed in a local module.

- The UDF inside this view references a function, class, or constant from the training module.

- Run feast apply to save the definition to the remote registry (e.g., S3, GCS, or SQL).

- Run store.push() from a different environment that does not have the training module installed in its Python path.

Root Cause Analysis

This is a serialization dependency issue. When dill (the library Feast uses for pickling) serializes a function, it often stores references to the modules where global variables or dep...



Copilot AI and others added 4 commits January 23, 2026 06:39
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix ModuleNotFoundError in On-Demand Feature View Jan 23, 2026
Copilot AI and others added 2 commits January 23, 2026 18:48
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>
@franciscojavierarceo

Member

@copilot maybe we should word it as skip_feature_view_validation to be consistent with #5859

…tency

Co-authored-by: franciscojavierarceo <4163062+franciscojavierarceo@users.noreply.github.com>

Copilot AI commented Jan 24, 2026

Contributor Author

@copilot maybe we should word it as skip_feature_view_validation to be consistent with #5859

Renamed in commit 885536e. Changed all occurrences of skip_validation to skip_feature_view_validation to match the naming convention from PR #5859.

Copilot AI changed the title feat: Add skip_validation parameter to push() for ODFVs with missing UDF dependencies Jan 24, 2026

@devin-ai-integration devin-ai-integration Bot left a comment

Contributor

Devin Review found 1 potential issue.

View issue and 4 additional flags in Devin Review.

Development

Successfully merging this pull request may close these issues.

Read On-Demand Feature View and deserialization while pushing data

2 participants