Bug: Regression in BigQuery offline store caused by newer pydantic versions
Expected Behavior
Calling store = FeatureStore(repo_path=repo_path) with a BigQuery offline store configured should succeed without raising an error.
Current Behavior
Running store = FeatureStore(repo_path=repo_path) for a BigQuery offline store causes:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    TypeError: 'pydantic_core._pydantic_core.ValidationInfo' object is not subscriptable

The error is raised at bigquery.py#L108.
Likely cause
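In pydantic>=2.0.0, a field validator that takes a second argument receives a ValidationInfo object rather than the plain dict of previously validated values that pydantic v1 validators received. ValidationInfo is not subscriptable; the validated values are exposed through its .data attribute.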
The setup.py for feast >= v0.36.0 requires that pydantic>=2.0.0.
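The failure can be reproduced without feast or BigQuery. The following is a minimal sketch, assuming pydantic>=2 is installed; StandInConfig is a hypothetical model that mirrors only the two relevant fields and is not the actual feast class.

```python
from typing import Optional

from pydantic import BaseModel, field_validator


class StandInConfig(BaseModel):
    """Hypothetical stand-in for the BigQuery offline store config."""

    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def project_id_exists(cls, v, values):
        # Under pydantic v2, `values` is a ValidationInfo object, not a dict,
        # so subscripting it raises TypeError.
        if v and not values["project_id"]:
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v


def reproduce() -> str:
    """Return the message of the TypeError raised by the validator, if any."""
    try:
        StandInConfig(
            project_id="some-project-id",
            billing_project_id="some-project-id",
        )
    except TypeError as exc:
        return str(exc)
    return "no error"
```

Calling reproduce() should return a message of the same form as the traceback above, which suggests the problem lies in the validator signature rather than in anything BigQuery-specific.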
Last known successful configuration:
- pydantic==1.10.16
- feast v0.35.0 (its setup.py allows pydantic>=1,<2)
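To tell which side of the pydantic 2 boundary a given environment is on, the installed versions can be read with the standard library. This is a small sketch; installed_version and pydantic_major are hypothetical helper names, not part of feast or pydantic.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def pydantic_major(pydantic_version: Optional[str]) -> Optional[int]:
    """Extract the major version number from a version string such as "1.10.16"."""
    if pydantic_version is None:
        return None
    return int(pydantic_version.split(".")[0])


if __name__ == "__main__":
    for name in ("feast", "pydantic"):
        print(name, installed_version(name))
    print("pydantic major:", pydantic_major(installed_version("pydantic")))
```

A major version of 1 corresponds to the last known working configuration; 2 corresponds to the failing one.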
Steps to reproduce
Example feature_store.yaml:
    project: my_feature_repo
    registry: gs://.../registry.db
    # The provider AWS used for the online store.
    # Mixing AWS and GCP should be okay provided that offline store type is specified.
    provider: aws
    offline_store:
      type: bigquery
      billing_project_id: some-project-id
      dataset: latest
      gcs_staging_location: gs://...
      location: EU
      project_id: some-project-id
    online_store:
      type: redis
      connection_string: ...
    entity_key_serialization_version: 2
Example script:

    import os

    # Credentials for GCP
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "GCP_keyfile.json"

    from feast import FeatureStore, RepoConfig

    if __name__ == "__main__":
        # repo_path must contain the GCP_keyfile.json and feature_store.yaml files
        repo_path = "./"
        assert os.path.exists(repo_path)

        # Create a feature store object
        store = FeatureStore(repo_path=repo_path)  # <--- Error occurs here
Specifications
- Version: feast v0.38.0
- Platform: x86-64 Ubuntu 22.04.3 LTS as well as arm64 Mac OS Sonoma 14.4.1
- Subsystem:
Possible Solution
As a first attempt, replacing values["project_id"] with values.data["project_id"] in the pydantic field_validator for billing_project_id at bigquery.py#L108 should work:
    @field_validator("billing_project_id")
    def project_id_exists(cls, v, values, **kwargs):
        # if v and not values["project_id"]:
        if v and not values.data["project_id"]:
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v
More testing is needed before I can make a PR.
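As a starting point for that testing, the proposed change can be exercised on a stand-in model. This is a sketch only, assuming pydantic>=2 is installed; FixedStandInConfig and is_accepted are hypothetical names, and the model copies just the two relevant fields rather than the real feast config. It uses info.data.get("project_id") instead of a direct lookup, which is an assumption beyond the proposal above: info.data contains only fields validated so far, so .get avoids a possible KeyError if project_id is absent from it.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class FixedStandInConfig(BaseModel):
    """Hypothetical stand-in with the validator adapted to pydantic v2."""

    # project_id is declared first so that it is validated before
    # billing_project_id and is visible through info.data.
    project_id: Optional[str] = None
    billing_project_id: Optional[str] = None

    @field_validator("billing_project_id")
    @classmethod
    def project_id_exists(cls, v, info: ValidationInfo):
        # info.data is a dict of the fields validated so far.
        if v and not info.data.get("project_id"):
            raise ValueError(
                "please specify project_id if billing_project_id is specified"
            )
        return v


def is_accepted(**fields) -> bool:
    """Return True if the stand-in config validates with the given fields."""
    try:
        FixedStandInConfig(**fields)
    except ValidationError:
        return False
    return True
```

With this sketch, a config that sets both IDs validates, a config that sets only billing_project_id is rejected with the original error message, and a config that sets neither still validates because the validator does not run on the default value.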