feat: Support nested collection types (Array/Set of Array/Set) (#5947) by soooojinlee · Pull Request #6132 · feast-dev/feast

soooojinlee added a commit to soooojinlee/feast that referenced this pull request

Mar 28, 2026
…e VALUE_LIST/VALUE_SET

Replace 4 combinatorial enum values (LIST_LIST=36, LIST_SET=37, SET_LIST=38,
SET_SET=39) with 2 recursive enum values (VALUE_LIST=40, VALUE_SET=41) that
use RepeatedValue to enable unlimited nesting depth. This is a breaking change
for an unreleased feature, as suggested in PR feast-dev#6132 review.

Key changes:
- Proto: Remove 4 enum/oneof fields, add VALUE_LIST/VALUE_SET with reserved 36-39
- Python: Update ValueType enum, type system, serialization, field persistence
- JSON: Update proto_json encode/decode for new field names
- Tests: Rewrite all nested collection tests (204 tests passing)
- Docs: Update type-system.md for recursive design

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
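The recursive design described in this commit message can be illustrated with a small stand-alone sketch. It does not use Feast's generated protobuf classes: `Kind`, `Nested`, and `depth` are hypothetical names chosen for illustration, and only the enum numbers (40, 41, reserved 36-39) are taken from the message above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, List


class Kind(IntEnum):
    """Hypothetical mirror of the two recursive enum values."""
    VALUE_LIST = 40
    VALUE_SET = 41


# Numbers retired with the combinatorial design; they must not be reused.
RESERVED_NUMBERS = frozenset(range(36, 40))


@dataclass
class Nested:
    """A collection whose items are scalars or further Nested collections."""
    kind: Kind
    items: List[Any]


def depth(value: Any) -> int:
    """Return the nesting depth: 0 for a scalar, 1 for a flat collection."""
    if not isinstance(value, Nested):
        return 0
    return 1 + max((depth(item) for item in value.items), default=0)


# Two levels: a list of sets.
two_levels = Nested(
    Kind.VALUE_LIST,
    [Nested(Kind.VALUE_SET, [1, 2]), Nested(Kind.VALUE_SET, [3])],
)
# Three levels need no additional enum value, only one more wrapper.
three_levels = Nested(Kind.VALUE_LIST, [two_levels])
```

Because each container only records its own kind, adding a level of nesting never requires a new enum value, which is what the four combinatorial values could not offer.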

soooojinlee added a commit to soooojinlee/feast that referenced this pull request

Apr 1, 2026
…e VALUE_LIST/VALUE_SET

…-dev#5947)

Add support for 2-level nested collection types: Array(Array(T)),
Array(Set(T)), Set(Array(T)), and Set(Set(T)).

- Add 4 generic ValueType enums (LIST_LIST, LIST_SET, SET_LIST, SET_SET)
  backed by RepeatedValue proto messages
- Persist inner type info in Field tags (feast:nested_inner_type),
  following the existing Struct schema tag pattern
- Handle edge cases: empty inner collections, Set dedup at inner level,
  depth limit enforcement (2 levels max)
- Add proto/JSON/remote transport serialization support
- Add 25 unit tests covering all combinations and edge cases

Signed-off-by: Soojin Lee <lsjin0602@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

- Fix remote online store read path to use declared feature types from
  FeatureView instead of ValueType.UNKNOWN, which fails for nested
  collection types (LIST_LIST, LIST_SET, SET_LIST, SET_SET)
- Add Nested Collection Types section to type-system.md with type table,
  usage examples, and empty-inner-collection→None limitation docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

…for nested collection types

- Add nested list handling in proto_json from_json_object (list of lists
  was raising ParseError since no branch matched list-typed elements)
- Fix pa_to_feast_value_type to recognize nested list PyArrow types
  (list<item: list<item: T>>) instead of crashing with KeyError
- Replace silent String fallback in _str_to_feast_type with ValueError
  to surface corrupted tag values instead of silently losing type info
- Strengthen test coverage: type str roundtrip, inner value verification,
  multi-value batch, proto JSON roundtrip, PyArrow nested type inference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
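
The change from a silent String fallback to a `ValueError`, together with the type-string round trip mentioned in the tests, might look roughly like the sketch below. It is not Feast's `_str_to_feast_type`: `parse_type`, `format_type`, the tuple encoding, and the short list of scalar names are assumptions made for illustration.

```python
from typing import Tuple, Union

# Illustrative subset of scalar type names (an assumption, not the full list).
SCALARS = {"String", "Int32", "Int64", "Float32", "Float64", "Bool", "Bytes"}
CONTAINERS = {"Array", "Set"}

TypeTree = Union[str, Tuple[str, "TypeTree"]]


def parse_type(text: str) -> TypeTree:
    """Parse strings such as 'Array(Set(Int64))' into a nested tuple."""
    text = text.strip()
    if text in SCALARS:
        return text
    name, open_paren, rest = text.partition("(")
    if name in CONTAINERS and open_paren and rest.endswith(")"):
        return (name, parse_type(rest[:-1]))
    # Fail loudly rather than defaulting to String and losing type information.
    raise ValueError(f"Unrecognized type string: {text!r}")


def format_type(tree: TypeTree) -> str:
    """Inverse of parse_type, used to check the round trip."""
    if isinstance(tree, str):
        return tree
    name, inner = tree
    return f"{name}({format_type(inner)})"
```

With this shape, a corrupted tag such as `Arrray(String)` raises immediately instead of being read back as a plain String feature.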
Use getattr/CopyFrom instead of **dict unpacking for ProtoValue
construction to satisfy mypy's strict type checking.

Signed-off-by: soojin <soojin@dable.io>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

…n edge case

- Add __eq__/__hash__ to Array and Set so inner element types are compared
  (previously Array(Array(String)) == Array(Array(Int32)) was True)
- Fix nested collection detection in proto_json when first element is None
  by using any() fallback instead of only checking value[0]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
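
The equality fix described above can be sketched as follows. These classes are simplified stand-ins, not Feast's actual `Array` and `Set`: the `Collection` base class and the use of plain strings for scalar types are assumptions.

```python
class Collection:
    """Hypothetical base for Array and Set type descriptors."""

    def __init__(self, inner):
        self.inner = inner  # a scalar type name or another Collection

    def __eq__(self, other):
        # Same container kind AND same element type, compared recursively.
        if type(self) is not type(other):
            return NotImplemented
        return self.inner == other.inner

    def __hash__(self):
        # Must agree with __eq__: equal descriptors hash equally.
        return hash((type(self).__name__, self.inner))


class Array(Collection):
    pass


class Set(Collection):
    pass
```

Defining `__hash__` alongside `__eq__` matters because a class that overrides only `__eq__` becomes unhashable in Python, and type descriptors are commonly used as dictionary keys or set members.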
… coverage

- Remove 2-level depth restriction from Array and Set constructors
  to support unbounded nesting per maintainer request
- Make _convert_nested_collection_to_proto() recursive for 3+ levels
- Update error message for nested type inference to guide users
  toward explicit Field dtype declaration
- Add 3+ level tests for Field roundtrip, str roundtrip, and PyArrow conversion

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
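
A recursive conversion that handles three or more levels could be sketched as below. This is not the body of `_convert_nested_collection_to_proto()`: the function name `convert`, the tuple-based type descriptor, and the order-preserving dedup for Set levels are assumptions for illustration, and the output is plain Python lists rather than proto messages.

```python
from typing import Any


def convert(value: Any, dtype: Any) -> Any:
    """Recursively normalise `value` to match `dtype`.

    `dtype` is either a scalar type name (str) or a pair
    ("Array" | "Set", inner_dtype); this encoding is an assumption.
    """
    if isinstance(dtype, str) or value is None:
        return value  # scalars and missing values pass through unchanged
    kind, inner = dtype
    if not isinstance(value, (list, tuple, set, frozenset)):
        raise TypeError(
            f"expected a collection for {kind}, got {type(value).__name__}"
        )
    items = [convert(item, inner) for item in value]
    if kind == "Set":
        # Drop duplicates while keeping first-seen order. Comparing by
        # equality also works for unhashable items such as nested lists.
        unique = []
        for item in items:
            if item not in unique:
                unique.append(item)
        items = unique
    return items
```

The function recurses once per level of the descriptor, so no depth limit is encoded anywhere in the logic.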
…e VALUE_LIST/VALUE_SET

Replace 4 combinatorial enum values (LIST_LIST=36, LIST_SET=37, SET_LIST=38,
SET_SET=39) with 2 recursive enum values (VALUE_LIST=40, VALUE_SET=41) that
use RepeatedValue to enable unlimited nesting depth. This is a breaking change
for an unreleased feature, as suggested in PR feast-dev#6132 review.

Key changes:
- Proto: Remove 4 enum/oneof fields, add VALUE_LIST/VALUE_SET with reserved 36-39
- Python: Update ValueType enum, type system, serialization, field persistence
- JSON: Update proto_json encode/decode for new field names
- Tests: Rewrite all nested collection tests (204 tests passing)
- Docs: Update type-system.md for recursive design

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

…imize JSON nested list detection

- Add _parse_pa_type_str() to reconstruct PyArrow types from type strings
  for VALUE_LIST/VALUE_SET, avoiding lossy round-trip through placeholder
- Optimize proto_json nested list detection: only scan with any() when
  first element is None, avoiding O(n) scan for flat lists
- Add warning log for unrecognized PyArrow type strings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>
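
The nested-list detection optimization described above might be sketched like this. The helper names `looks_nested` and `_is_collection` are hypothetical, and the actual check in `proto_json` may differ in detail.

```python
from typing import Any, Sequence


def _is_collection(item: Any) -> bool:
    return isinstance(item, (list, tuple))


def looks_nested(values: Sequence[Any]) -> bool:
    """Return True when `values` appears to be a list of lists."""
    if not values:
        return False
    first = values[0]
    if first is not None:
        # Fast path: one check decides for the common flat-list case.
        return _is_collection(first)
    # Slow path, taken only when the first element is None: scan the rest.
    return any(_is_collection(item) for item in values)
```

A flat list such as `[1, 2, 3]` is decided by a single `isinstance` check, while `[None, [1]]` still gets detected as nested through the `any()` scan.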
… clarify placeholder pyarrow type

- Add np.ndarray to isinstance check in _convert_nested_collection_to_proto
  to fix KeyError for 3+ level nesting during materialization (PyArrow produces
  np.ndarray, not Python list)
- Add comment clarifying VALUE_LIST/VALUE_SET placeholder in feast_value_type_to_pa

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: soojin <soojin@dable.io>

franciscojavierarceo pushed a commit that referenced this pull request

Apr 7, 2026
# [0.61.0](v0.60.0...v0.61.0) (2026-04-07)

### Bug Fixes

* Add grpcio dependency group to transformation server Dockerfile ([2c2150a](2c2150a))
* Add https readiness check for rest-registry tests ([ea85e63](ea85e63))
* Add website build check for PRs and fix blog frontmatter YAML error ([#6079](#6079)) ([30a3a43](30a3a43))
* Added missing jackc/pgx/v5 entries ([94ad0e7](94ad0e7))
* Added MLflow metric charts across feature selection ([#6080](#6080)) ([a403361](a403361))
* Check duplicate names for feature view across types ([#5999](#5999)) ([95b9af8](95b9af8))
* Fix integration tests ([#6046](#6046)) ([02d5548](02d5548))
* Fix missing error handling for resource_counts endpoint ([d9706ce](d9706ce))
* Fix non-specific label selector on metrics service ([a1a160d](a1a160d))
* Fix path to feature_definitions.py ([7d7df68](7d7df68))
* Fix registry REST API tests intermittent failure ([d53a339](d53a339))
* Fixed IntegrityError on SqlRegistry ([#6047](#6047)) ([325e148](325e148))
* Fixed intermittent failures in get_historical_features ([c335ec7](c335ec7))
* Fixed pre-commit check ([114b7db](114b7db))
* Fixed the intermittent FeatureViewNotFoundException ([661ecc7](661ecc7))
* Fixed uv cache permission error for docker build on mac ([ad807be](ad807be))
* Fixes a `PydanticDeprecatedSince20` warning for trino_offline_store ([#5991](#5991)) ([abfd18a](abfd18a))
* Handle existing RBAC role gracefully in namespace registry ([b46a62b](b46a62b))
* Ignore ipynb files during apply ([#6151](#6151)) ([4ea123d](4ea123d))
* Integration test failures ([#6040](#6040)) ([9165870](9165870))
* Mount TLS volumes for init container ([080a9b5](080a9b5))
* **postgres:** Use end_date in synthetic entity_df for non-entity retrieval ([#6110](#6110)) ([088a802](088a802)), closes [#6066](#6066)
* Ray offline store tests are duplicated across 3 workflows ([54f705a](54f705a))
* Reenable tests ([#6036](#6036)) ([82ee7f8](82ee7f8))
* SSL/TLS mode by default for postgres connection ([4844488](4844488))
* Use commitlint pre-commit hook instead of a separate action ([35a81e7](35a81e7))

### Features

* Add Claude Code agent skills for Feast ([#6081](#6081)) ([1e5b60f](1e5b60f)), closes [#5976](#5976) [#6007](#6007)
* Add complex type support (Map, JSON, Struct) with schema validation ([#5974](#5974)) ([1200dbf](1200dbf))
* Add decimal to supported feature types ([#6029](#6029)) ([#6226](#6226)) ([cff6fbf](cff6fbf))
* Add feast apply init container to automate registry population on pod start ([#6106](#6106)) ([6b31a43](6b31a43))
* Add feature view versioning support to PostgreSQL and MySQL online stores ([#6193](#6193)) ([940e0f0](940e0f0)), closes [#6168](#6168) [#6169](#6169) [#2728](#2728)
* Add materialization, feature freshness, request latency, and push metrics to feature server ([2c6be18](2c6be18))
* Add metadata statistics to registry api ([ef1d4fc](ef1d4fc))
* Add non-entity retrieval support for ClickHouse offline store ([4d08ddc](4d08ddc)), closes [#5835](#5835)
* Add OnlineStore for MongoDB ([#6025](#6025)) ([bf4e3fa](bf4e3fa)), closes [golang/go#74462](golang/go#74462)
* Add Oracle DB as Offline store in python sdk & operator ([#6017](#6017)) ([9d35368](9d35368))
* Add RBAC aggregation labels to FeatureStore ClusterRoles ([daf77c6](daf77c6))
* Add ServiceMonitor auto-generation for Prometheus discovery ([#6126](#6126)) ([56e6d21](56e6d21))
* Add typed_features field to grpc write request ([#6117](#6117)) ([#6118](#6118)) ([eeaa6db](eeaa6db)), closes [#6116](#6116)
* Add UUID and TIME_UUID as feature types ([#5885](#5885)) ([#5951](#5951)) ([5d6e311](5d6e311))
* Add version indicators to lineage graph nodes ([#6187](#6187)) ([73805d3](73805d3))
* Add version tracking to FeatureView ([#6101](#6101)) ([ed4a4f2](ed4a4f2))
* Added Agent skills for AI Agents ([#6007](#6007)) ([99008c8](99008c8))
* Added CodeQL SAST scanning and detect-secrets pre-commit hook ([547b516](547b516))
* Added odfv transformations metrics ([8b5a526](8b5a526))
* Adding optional name to Aggregation (feast-dev[#5994](#5994)) ([#6083](#6083)) ([56469f7](56469f7))
* Created DocEmbedder class ([#5973](#5973)) ([0719c06](0719c06))
* Extended OIDC support to extract groups & namespaces and token injection with multiple methods ([#6089](#6089)) ([7c04026](7c04026))
* Feature Server High-Availability on Kubernetes ([#6028](#6028)) ([9c07b4c](9c07b4c))
* **go:** Implement metrics and tracing for http and grpc servers ([#5925](#5925)) ([2b4ec9a](2b4ec9a))
* Horizontal scaling support to the Feast operator ([#6000](#6000)) ([3ec13e6](3ec13e6))
* Making feature view source optional (feast-dev[#6074](#6074)) ([#6075](#6075)) ([76917b7](76917b7))
* Replace ORJSONResponse with Pydantic response models for faster JSON serialization ([65cf03c](65cf03c))
* Support arm docker build ([#6061](#6061)) ([1e1f5d9](1e1f5d9))
* Support distinct count aggregation [[#6116](#6116)] ([3639570](3639570))
* Support HTTP in MCP ([#6109](#6109)) ([e72b983](e72b983))
* Support nested collection types (Array/Set of Array/Set) ([#5947](#5947)) ([#6132](#6132)) ([ab61642](ab61642))
* Support podAnnotations on Deployment pod template ([1b3cdc1](1b3cdc1))
* Use orjson for faster JSON serialization in feature server ([6f5203a](6f5203a))
* Utilize date partition column in BigQuery ([#6076](#6076)) ([4ea9b32](4ea9b32))

### Performance Improvements

* Online feature response construction in a single pass over read rows ([113fb04](113fb04))
* Optimize protobuf parsing in Redis online store ([#6023](#6023)) ([59dfdb8](59dfdb8))
* Optimize timestamp conversion in _convert_rows_to_protobuf ([33a2e95](33a2e95))
* Parallelize DynamoDB batch reads in sync online_read ([#6024](#6024)) ([9699944](9699944))
* Remove redundant entity key serialization in online_read ([d87283f](d87283f))