ci: streamline GitHub Actions test coverage and runtime
Context
The current GitHub Actions setup gives us good coverage, but it is doing more work than we probably need on every PR and master push. Recent master runs show the largest costs are in the master-only integration/build workflow, with full Python integration across 3.10/3.11/3.12 and a very slow feature-server-dev Docker build. PR CI also has some low-signal jobs that look redundant with other checks.
Observations
- unit-tests runs the full Python unit suite across Python 3.10, 3.11, 3.12 on Ubuntu plus 3.11 and 3.12 on macOS.
- Several files under sdk/python/tests/unit behave more like local functional/integration tests: subprocess CLI calls, feast init/apply, Spark setup, Docker/testcontainers, feature server startup, etc.
- smoke_tests.yml installs full CI dependencies on Python 3.10/3.11/3.12 just to import feast.cli.
- check_skip_tests.yml exists as a reusable docs/examples/community skip gate, but the main unit/linter/smoke workflows do not call it.
- master_only.yml runs full integration plus benchmark upload across the same Python version matrix.
- The feature-server-dev Docker build is the master workflow long pole and should get its own cache/layering/toolchain audit.
Suggested Cleanup Tracks
Low-risk CI hygiene
- Collapse or remove the redundant smoke workflow.
- Add path/doc-only skipping for low-risk workflows.
- Move benchmark publishing out of blocking master integration, or limit it to one Python version.
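As a rough sketch of the path-based skipping idea, a single-version import check with a path filter might look like the fragment below. The workflow name, job name, ignored paths, and install command are all assumptions for illustration and are not taken from the repository. It uses the built-in paths-ignore filter rather than the existing check_skip_tests.yml gate, whose interface is not described here.

```yaml
# Hypothetical workflow: names, paths, and the install command are illustrative.
name: smoke-example
on:
  pull_request:
    # Skip this workflow when a PR only touches documentation-like paths.
    paths-ignore:
      - "docs/**"
      - "examples/**"
      - "**/*.md"
jobs:
  import-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          # One version only, instead of 3.10/3.11/3.12.
          python-version: "3.11"
      # Assumed install command; the real workflow may install differently.
      - run: pip install -e .
      - run: python -c "import feast.cli"
```

One caveat with this approach: if a workflow skipped by a path filter is also a required status check, the check stays pending and can block the merge, so path filters fit best on non-required workflows.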
Test-suite restructuring
- Keep pure unit tests as the default fast PR gate.
- Move or mark local functional tests separately: CLI subprocess, feature repo apply/init, Spark, Docker/testcontainers, and feature server startup.
- Run that local functional slice on one Python/OS combination.
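One possible shape for this split, assuming the local functional tests are tagged with a pytest marker, is sketched below. The marker name local_functional is hypothetical: it would need to be registered in the pytest configuration and applied to the relevant tests first. The job names and install command are likewise assumptions.

```yaml
# Hypothetical workflow: the marker name, job names, and install command are illustrative.
name: unit-example
on:
  pull_request:
jobs:
  unit:
    # Fast PR gate: pure unit tests only.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e .  # assumed install command
      - run: pytest sdk/python/tests/unit -m "not local_functional"
  local-functional:
    # Heavier slice (CLI subprocess, Spark, Docker, server startup) on one Python/OS only.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -e .  # assumed install command
      - run: pytest sdk/python/tests/unit -m "local_functional"
```

Using a marker rather than moving files keeps the first PR small; relocating the tests into a separate directory could follow once the split has proven stable.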
Matrix policy
- Keep full unit coverage on one primary Python version.
- Use smaller compatibility smoke/subset checks for other Python versions and macOS.
- Run full compatibility coverage nightly or on release-sensitive paths.
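The matrix policy above could be expressed roughly as follows, assuming a hypothetical compat_smoke pytest marker for the reduced subset and a hypothetical scope matrix key. The cron time, primary version, and install command are illustrative choices, not project decisions.

```yaml
# Hypothetical workflow: scope values, marker name, and schedule are illustrative.
name: compat-example
on:
  pull_request:
  schedule:
    - cron: "0 3 * * *"  # nightly at 03:00 UTC
jobs:
  tests:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - { os: ubuntu-latest, python-version: "3.11", scope: full }
          - { os: ubuntu-latest, python-version: "3.10", scope: subset }
          - { os: ubuntu-latest, python-version: "3.12", scope: subset }
          - { os: macos-latest, python-version: "3.11", scope: subset }
          - { os: macos-latest, python-version: "3.12", scope: subset }
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e .  # assumed install command
      - name: Run tests
        shell: bash
        run: |
          # Full suite on the primary entry and on every nightly run; subset otherwise.
          if [ "${{ matrix.scope }}" = "full" ] || [ "${{ github.event_name }}" = "schedule" ]; then
            pytest sdk/python/tests/unit
          else
            pytest sdk/python/tests/unit -m "compat_smoke"
          fi
```

Scheduled workflows run against the default branch, so the nightly run here would cover master rather than open PRs.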
Master integration/build improvements
- Consider full integration on Python 3.11, with smaller compatibility coverage for 3.10/3.12.
- Split benchmark collection into a scheduled or non-blocking job.
- Audit feature-server-dev Docker build caching, Node version alignment, image context, and layer ordering.
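As a starting point for the build audit and the benchmark split, the fragment below sketches a Docker build that uses the GitHub Actions cache backend alongside a non-blocking benchmark job. The job names, Dockerfile path, and benchmark command are placeholders, and whether the gha cache actually helps depends on the layer ordering the audit uncovers.

```yaml
# Hypothetical workflow: job names, the Dockerfile path, and the benchmark step are placeholders.
name: master-build-example
on:
  push:
    branches: [master]
jobs:
  build-feature-server-dev:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v6
        with:
          context: .
          file: path/to/Dockerfile  # placeholder, not the real location
          push: false
          # Reuse layers across runs through the GitHub Actions cache backend.
          cache-from: type=gha
          cache-to: type=gha,mode=max
  benchmarks:
    runs-on: ubuntu-latest
    # A benchmark failure does not fail the workflow run.
    continue-on-error: true
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"  # one version only
      - run: echo "run and upload benchmarks here"  # placeholder command
```

Keeping benchmarks in their own job means integration results are reported independently of benchmark upload time; moving the job to a schedule trigger would take it off the master push path entirely.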
This issue is intentionally broad so contributors can pick off independent pieces without needing to solve the whole CI design in one PR.