feat: Add UUID and TIME_UUID as feature types (#5885)#5951
Conversation
1d4cd01 to
4a5c932
Compare
February 8, 2026 15:00
|
@soooojinlee , thanks so much for putting this together! Can you rebase to bring this PR up to date? |
Sorry, something went wrong.
2c56521 to
1fd106a
Compare
February 11, 2026 04:44
26198a6 to
525ac72
Compare
February 11, 2026 14:14
cb6dd44 to
54c2eae
Compare
February 12, 2026 05:55
2dd648a to
b280364
Compare
February 16, 2026 06:44
|
@soooojinlee , can you also add support for UUID_SET and TIME_UUID_SET? |
Sorry, something went wrong.
|
@soooojinlee , can you also update the docs with the newly supported types? |
Sorry, something went wrong.
180434b to
41cfe7e
Compare
February 17, 2026 04:26
I added UUID_SET / TIME_UUID_SET support and updated the type system documentation as requested in here 41cfe7e |
Sorry, something went wrong.
07fbde4 to
6aa853a
Compare
March 27, 2026 14:49
|
@soooojinlee , can we update this and merge it before the VALUE_SET and VALUE_LIST change? |
Sorry, something went wrong.
nquinn408
left a comment
There was a problem hiding this comment.
Looks good. Thanks!
Sorry, something went wrong.
Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
Add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val as dedicated oneof fields in the Value proto message, replacing the previous reuse of string_val/string_list_val. This allows UUID types to be identified from the proto field alone without requiring a feature_types side-channel. Backward compatibility is maintained for data previously stored as string_val. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Signed-off-by: soojin <soojin@dable.io>
Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip support, backward compatibility, and documentation for all UUID types. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ype mappings Keep PDF_BYTES=30 and IMAGE_BYTES=31 at their upstream values instead of renumbering them. Shift UUID types to 32-37 in both proto and Python enum. Also add missing SET type entries in _convert_value_type_str_to_value_type(), convert_array_column(), and _get_sample_values_by_type() for completeness. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The comment claimed Sets do not support UUID/TimeUuid but the code intentionally allows them. Updated to reflect actual behavior. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…o top Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rialization Return UUID proto fields as plain strings instead of falling through to feast_value_type_to_python_type which converts them to uuid.UUID objects that are not JSON-serializable, causing TypeError during HTTP transport. Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
Add [misc] error code to type: ignore comments in UUID list/set proto conversion to satisfy mypy's stricter checking. Signed-off-by: Soojin Lee <soooojin.lee@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io>
3fcfead to
7b600cb
Compare
April 1, 2026 12:31
5d6e311
into
feast-dev:master
Apr 1, 2026
…-dev#5951) * feat: Add UUID and TIME_UUID as feature types (feast-dev#5885) Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> * test: Add unit tests for UUID type support Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> * style: Fix ruff lint and formatting issues Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> * feat: Add dedicated UUID/TIME_UUID proto fields to Value.proto Add uuid_val, time_uuid_val, uuid_list_val, time_uuid_list_val as dedicated oneof fields in the Value proto message, replacing the previous reuse of string_val/string_list_val. This allows UUID types to be identified from the proto field alone without requiring a feature_types side-channel. Backward compatibility is maintained for data previously stored as string_val. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> * fix: Address review feedback for UUID type support Signed-off-by: soojin <soojin@dable.io> * fix: Address review feedback for UUID type support Signed-off-by: soojin <soojin@dable.io> * fix: Address review feedback Signed-off-by: soojin <soojin@dable.io> * fix: Convert uuid.UUID to string for Arrow and JSON serialization Signed-off-by: soojin <soojin@dable.io> * feat: Add UUID_SET/TIME_UUID_SET support and update type system docs Add Set(Uuid) and Set(TimeUuid) as feature types with full roundtrip support, backward compatibility, and documentation for all UUID types. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Preserve PDF_BYTES/IMAGE_BYTES enum values and add missing SET type mappings Keep PDF_BYTES=30 and IMAGE_BYTES=31 at their upstream values instead of renumbering them. Shift UUID types to 32-37 in both proto and Python enum. Also add missing SET type entries in _convert_value_type_str_to_value_type(), convert_array_column(), and _get_sample_values_by_type() for completeness. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Correct misleading comment in Set.__init__ The comment claimed Sets do not support UUID/TimeUuid but the code intentionally allows them. Updated to reflect actual behavior. Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: Extract UUID Arrow conversion into helper and move import to top Signed-off-by: soojin <soojin@dable.io> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: Handle UUID types in _proto_value_to_transport_value for JSON serialization Return UUID proto fields as plain strings instead of falling through to feast_value_type_to_python_type which converts them to uuid.UUID objects that are not JSON-serializable, causing TypeError during HTTP transport. Signed-off-by: soojin <soojin@dable.io> * chore: Regenerate protobuf files with UUID type support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> * fix: Fix mypy type ignore comments for UUID collection conversions Add [misc] error code to type: ignore comments in UUID list/set proto conversion to satisfy mypy's stricter checking. Signed-off-by: Soojin Lee <soooojin.lee@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: soojin <soojin@dable.io> --------- Signed-off-by: soojin <soojin@dable.io> Signed-off-by: Soojin Lee <soooojin.lee@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: yuanjun220 <1069645408@qq.com>
# [0.61.0](v0.60.0...v0.61.0) (2026-04-07) ### Bug Fixes * Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a)) * Add https readiness check for rest-registry tests ([ea85e63](ea85e63)) * Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43)) * Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7)) * Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361)) * Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8)) * Fix integration tests ([#6046](#6046)) ([02d5548](02d5548)) * Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce)) * Fix non-specific label selector on metrics service ([a1a160d](a1a160d)) * fix path feature_definitions.py ([7d7df68](7d7df68)) * Fix regstry Rest API tests intermittent failure ([d53a339](d53a339)) * Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148)) * Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7)) * Fixed pre-commit check ([114b7db](114b7db)) * Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7)) * Fixed uv cache permission error for docker build on mac ([ad807be](ad807be)) * Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a)) * Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b)) * Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d)) * Integration test failures ([#6040](#6040)) ([9165870](9165870)) * Mount TLS volumes for init container ([080a9b5](080a9b5)) * **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066) * Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a)) * Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8)) * SSL/TLS mode by default for postgres connection ([4844488](4844488)) * Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7)) ### Features * Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007) * Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf)) * Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf)) * Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43)) * Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728) * Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18)) * Add metadata statistics to registry api ([ef1d4fc](ef1d4fc)) * Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835) * Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462) * Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368)) * Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6)) * Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21)) * Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116) * Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311)) * Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3)) * Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2)) * Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8)) * Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516)) * Added odfv transformations metrics ([8b5a526](8b5a526)) * Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7)) * Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06)) * Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026)) * Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c)), closes [Hi#Availability](https://github.com/Hi/issues/Availability) [Hi#Availability](https://github.com/Hi/issues/Availability) * **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a)) * Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6)) * Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7)) * Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c)) * Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9)) * Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570)) * Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983)) * Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642)) * Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1)) * Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a)) * Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32)) ### Performance Improvements * Online feature response construction in a single pass over read rows ([113fb04](113fb04)) * Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8)) * Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95)) * Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944)) * Remove redundant entity key serialization in online_read ([d87283f](d87283f))
# [0.62.0](v0.61.0...v0.62.0) (2026-04-08) ### Bug Fixes * Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7)) * Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce)) * fix path feature_definitions.py ([7d7df68](7d7df68)) * Fix regstry Rest API tests intermittent failure ([d53a339](d53a339)) * Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7)) * Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7)) * Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b)) * Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d)) * Mount TLS volumes for init container ([080a9b5](080a9b5)) * **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066) * SSL/TLS mode by default for postgres connection ([4844488](4844488)) * Sync v0.61-branch so v0.61.0 tag is reachable from master ([af66878](af66878)) ### Features * Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007) * Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf)) * Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43)) * Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728) * Add metadata statistics to registry api ([ef1d4fc](ef1d4fc)) * Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368)) * Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6)) * Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21)) * Add typed_features field to grpc write request (([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116) * Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311)) * Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3)) * Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2)) * Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8)) * Added odfv transformations metrics ([8b5a526](8b5a526)) * Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06)) * Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026)) * Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c)) * Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570)) * Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983)) * Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642)) * Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1)) * Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32)) ### Performance Improvements * Online feature response construction in a single pass over read rows ([113fb04](113fb04))
What this PR does / why we need it:
Adds
UUIDandTIME_UUIDas native Feast feature types, resolving #5885. Currently UUID values must be stored as STRING, which loses type semantics, prevents backend-specific features (e.g. Cassandra timeuuid range queries), and makes PostgreSQLuuidcolumns infer as STRING. This PR enables users to declare UUID features withField(name="user_id", dtype=Uuid)and receiveuuid.UUIDobjects fromget_online_features().to_dict().Design Decisions
Why two types (UUID vs TIME_UUID)?
The issue author explicitly requested distinguishing time-based UUID (uuid1) and random UUID (uuid4). Both serialize
identically to
stringin proto, but separate types allow expressing intent in feature definitions and enable future backend-specific optimizations.Why dedicated proto fields (
uuid_val,time_uuid_val)?Following the pattern established by SET types (PR #5888) and UNIX_TIMESTAMP (which reuses
int64/Int64List), we add dedicated oneof fields that reuse existing proto scalar types (stringandStringList). This allowsWhichOneof("val")to identify UUID types directly from the proto message, without requiring a side-channel.Backward compatibility for data stored before this change:
OnlineResponseaccepts an optionalfeature_typesdict. When data was previously stored asstring_val, this metadata enablesfeast_value_type_to_python_type()to convert it touuid.UUID. New materializations useuuid_val/time_uuid_valand are identified automatically.Changes
Value.proto, generated*_pb2.py/*_pb2.pyiUUID=30,TIME_UUID=31,UUID_LIST=32,TIME_UUID_LIST=33toValueType.Enum; adduuid_val,time_uuid_val,uuid_list_val,time_uuid_list_valtoValue.oneofvalue_type.py,types.pyUUID,TIME_UUID,UUID_LIST,TIME_UUID_LISTenums andUuid/TimeUuidaliasestype_map.pystring_valtouuid_val; addPROTO_VALUE_TO_VALUE_TYPE_MAPentries for UUID fieldsonline_response.py,online_store.py,feature_store.py,utils.pyfeature_typesmetadata for backward-compatible deserializationon_demand_feature_view.pyBackward Compatibility
string_valstill deserializes correctly via thefeature_typesside-channeluuid_val/time_uuid_valproto fieldsfeast_value_type_to_python_type(v)withoutfeature_typenow returnsuuid.UUIDforuuid_valfields (previously returned plain string forstring_val)ValueType.UUID(previouslyValueType.STRING)Tests
test_types.py: Uuid/TimeUuid ↔ ValueType bidirectional conversion, Array typestest_type_map.py: Proto roundtrip withuuid_val,uuid.UUIDobject return, backward compatibility forstring_val, UUID list roundtrip, PostgreSQL mapping