Incorrect path in file data source to s3 bucket

Hi there! I was trying to implement a feature store: I created a repo, added feature.parquet to the S3 bucket, ran 'feast apply', then tried to get features in a notebook via the feature store. The S3 path of the file data source gets treated as a relative local path, so the read fails.

Expected Behavior

A DataFrame with the requested features.

Current Behavior

Error:
An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: [WinError 123] Failed querying information for path 'c:/Users/nboyarkin/Downloads/scm_forecast-1/notebooks/s3:/analytics-ds-dev-spark-upload-files/features/year.parquet'.

Steps to reproduce

from datetime import datetime

import pandas as pd
from feast import FeatureStore

fs = FeatureStore(fs_yaml_file='C:/Users/nboyarkin/Downloads/feast.yaml')

entity_df = pd.DataFrame.from_dict(
    {
        # entity's join key -> entity values
        "store_id": [12],
        "product_id": [27279],
        # "event_timestamp" (reserved key) -> timestamps
        "event_timestamp": [
            datetime(2024, 11, 1),
        ],
    }
)

# Fails on Windows: the s3:// path is resolved against the local directory
training_df = fs.get_historical_features(
    entity_df=entity_df,
    features=[
        "calendar_stats:year",
    ],
).to_df()

Specifications

  • Version: 0.42.0
  • Platform: windows
  • Subsystem:

Possible Solution

In feast\infra\offline_stores\dask.py, line 529, change:
if not Path(data_source.path).is_absolute():
to
if not Path(data_source.path).is_absolute() and Path(data_source.path).parts[0] != 's3:':
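
A more general variant of the same idea, as a rough sketch only: instead of comparing against 's3:' specifically, check for any URI scheme before treating the path as relative. The helper names is_remote_uri and resolve_path are hypothetical and not part of Feast, and PureWindowsPath is used here only so the Windows behaviour can be reproduced on any OS. The "longer than one character" rule is an assumption to avoid mistaking a drive letter such as C: for a scheme.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlparse


def is_remote_uri(path: str) -> bool:
    """Return True if path looks like scheme://... (e.g. s3://bucket/key).

    A one-character "scheme" is treated as a Windows drive letter (C:),
    so only schemes longer than one character count as remote.
    """
    return len(urlparse(path).scheme) > 1


def resolve_path(path: str, repo_path: str) -> str:
    """Hypothetical helper: prefix repo_path only for relative local paths."""
    if is_remote_uri(path) or PureWindowsPath(path).is_absolute():
        return path
    return str(PureWindowsPath(repo_path) / path)


s3_uri = "s3://analytics-ds-dev-spark-upload-files/features/year.parquet"
repo = "c:/Users/nboyarkin/Downloads/scm_forecast-1/notebooks"

# What the unguarded check effectively does: the URI is not "absolute"
# for pathlib, so it gets joined onto the local directory.
broken = PureWindowsPath(repo) / s3_uri
print(broken.as_posix())

# With the scheme check, the URI is passed through untouched.
print(resolve_path(s3_uri, repo))
```

With a check like this, other object stores (for example gs:// paths) would be passed through as well, which the 's3:' comparison alone would not cover.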