
fix: Update the pyarrow to latest v14.0.1 regarding the CVE-2023-47248. #3835

Closed
shuchu wants to merge 14 commits into feast-dev:master from shuchu:issue-3832


@shuchu shuchu commented Nov 14, 2023

Collaborator

What this PR does / why we need it:
Update pyarrow to the latest version, v14.0.1, which includes the fix for CVE-2023-47248.

Fixes #3832

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>

shuchu commented Nov 14, 2023

Collaborator Author

I'm a little worried about the unit test coverage.

Please be aware that I unpinned the pyarrow version.

py3.8-requirements.txt and py3.8-ci-requirements.txt were updated manually (because of the Dask version issue on Python 3.8).

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
…le write to temporary parquet file.

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
…ng pyarrow v10.0.1

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>

shuchu commented Nov 16, 2023

Collaborator Author

I've run into a very interesting problem. I only updated the pyarrow version and the Snowflake API, but the integration test results show a timestamp-range error when running a Redshift SQL query:

{ error:  Timestamp out of range.\n  
              code:      8001\n  }

It happens while running "get_historical_features()", where the timestamp range is inferred from the "entity_df", as in redshift.py::_get_entity_df_event_timestamp_range():
f"SELECT MIN({entity_df_event_timestamp_col}) AS min, MAX({entity_df_event_timestamp_col}) AS max "
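For readers unfamiliar with this step, here is a minimal sketch of inferring an event-timestamp range with a MIN/MAX query. It is not the Feast implementation: the helper name `build_range_query`, the `FROM` clause, the table name `entity_df`, and the sample rows are all assumptions for illustration, and SQLite stands in for Redshift.

```python
import sqlite3


def build_range_query(timestamp_col: str, table_name: str) -> str:
    """Build a MIN/MAX query over one timestamp column.

    Hypothetical helper: the SELECT part mirrors the fragment quoted above,
    while the FROM clause and table_name are assumptions.
    """
    return (
        f"SELECT MIN({timestamp_col}) AS min, MAX({timestamp_col}) AS max "
        f"FROM {table_name}"
    )


# SQLite is only a local stand-in for Redshift in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity_df (driver_id INTEGER, event_timestamp TEXT)")
conn.executemany(
    "INSERT INTO entity_df VALUES (?, ?)",
    [
        (1, "2023-11-14 12:00:00"),
        (2, "2023-11-16 08:30:00"),
        (3, "2023-11-15 00:00:00"),
    ],
)

# ISO-8601 strings sort chronologically, so MIN/MAX give the time range.
min_ts, max_ts = conn.execute(
    build_range_query("event_timestamp", "entity_df")
).fetchone()
conn.close()

print(min_ts, max_ts)
```

If the timestamps stored in the uploaded table were misinterpreted, the range returned by a query like this would be the first place the problem becomes visible.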

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>

shuchu commented Nov 17, 2023

Collaborator Author

Please do not merge this PR, @sudohainguyen.
It's in a messy state and is for debugging only right now.

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
@sudohainguyen

Collaborator

No worries, looking forward to seeing this work.

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Shuchu Han <shuchu.han@gmail.com>

shuchu commented Nov 17, 2023

Collaborator Author

Finally, I found the fix. It comes down to the "coerce_timestamps" parameter of "pyarrow.parquet.write_table".

Let me close this PR and create a clean new one.

@sudohainguyen

Collaborator

Great, @shuchu!!


Development

Successfully merging this pull request may close these issues.

Security vulnerability of python package: pyarrow (CVE-2023-47248)

3 participants