Python Feature Server
Relevant source files
- CONTRIBUTING.md
- README.md
- docs/SUMMARY.md
- docs/getting-started/quickstart.md
- docs/getting-started/third-party-integrations.md
- docs/how-to-guides/running-feast-in-production.md
- docs/project/maintainers.md
- docs/reference/feature-servers/offline-feature-server.md
- docs/reference/feature-servers/python-feature-server.md
- docs/reference/online-stores/README.md
- docs/roadmap.md
- examples/quickstart/quickstart.ipynb
- infra/feast-operator/README.md
- infra/feast-operator/config/samples/kustomization.yaml
- infra/templates/README.md.jinja2
- sdk/python/feast/feature_server.py
- sdk/python/feast/ui_server.py
- sdk/python/tests/unit/test_feature_server.py
- sdk/python/tests/unit/test_ui_server.py
Overview
The Python Feature Server is a FastAPI-based HTTP service that wraps the FeatureStore class to provide REST API access to Feast feature operations. It enables non-Python clients to retrieve online features, push streaming data, trigger materialization, and manage feature store operations via HTTP requests.
Key characteristics:
- Protocol: HTTP REST (FastAPI)
- Implementation: sdk/python/feast/feature_server.py1-1006
- Entry point: feast serve CLI command
- Default port: 6566
- Process model: Gunicorn + UvicornWorker (Linux/macOS) or Uvicorn (Windows)
Related servers:
- Go Feature Server (page 4.2): High-performance native implementation
- Java Feature Server (page 4.3): JVM-based gRPC implementation
- Offline Feature Server (page 4.4): Arrow Flight server for historical retrieval
- Web UI Server: React-based feature discovery interface
Sources: sdk/python/feast/feature_server.py1-50 sdk/python/feast/cli/serve.py
Architecture
The Python Feature Server is a stateless HTTP service that delegates requests to a FeatureStore instance. The server uses FastAPI for HTTP handling and maintains background threads for registry refresh and optional write batching.
Component Architecture
Python Feature Server Component Structure
Sources: sdk/python/feast/feature_server.py211-319 sdk/python/feast/feature_store.py124-169 sdk/python/feast/infra/passthrough_provider.py58-90
Request Flow
Online Feature Retrieval Flow: /get-online-features
Sources: sdk/python/feast/feature_server.py323-351 sdk/python/feast/feature_server.py132-162 sdk/python/feast/infra/passthrough_provider.py239-258
Process Initialization
Feature Server Startup Flow
Sources: sdk/python/feast/feature_server.py211-319 sdk/python/feast/feature_server.py697-714 sdk/python/feast/feature_server.py733-805
Configuration
The feature server is configured through the feature_store.yaml file. The RepoConfig class (sdk/python/feast/repo_config.py253-364) loads this configuration and provides it to the FeatureStore instance.
Feature Server Configuration
The feature_server field in feature_store.yaml is optional and supports additional configuration for advanced features:
The configuration class is resolved via the FEATURE_SERVER_CONFIG_CLASS_FOR_TYPE dictionary (sdk/python/feast/repo_config.py109-112):
| Type | Configuration Class |
|---|---|
| "local" | feast.infra.feature_servers.local_process.config.LocalFeatureServerConfig |
| "mcp" | feast.infra.mcp_servers.mcp_config.McpFeatureServerConfig |
The RepoConfig validates and instantiates the appropriate configuration class via _validate_feature_server_config().
Offline Write Batching
The feature server supports batching offline writes to improve throughput when using the /push endpoint with to: "offline" or to: "online_and_offline". The batching configuration is read during get_app() initialization (sdk/python/feast/feature_server.py251-285):
When enabled, offline writes are queued in an OfflineWriteBatcher instance and flushed either when the batch size threshold is reached or after the configured time interval. The batcher runs in a dedicated background thread (sdk/python/feast/feature_server.py814-938).
Sources: sdk/python/feast/repo_config.py253-364 sdk/python/feast/repo_config.py109-112 sdk/python/feast/feature_server.py251-285
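The flush rule described above (flush when the batch is full, or when the interval has elapsed) can be illustrated with a minimal sketch. This is not the OfflineWriteBatcher implementation: the class name SimpleWriteBatcher, its parameters, and the tick() method are hypothetical, and a fake clock replaces the background thread so the behavior is deterministic.

```python
import threading
import time
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class SimpleWriteBatcher:
    """Hypothetical stand-in for a size- or time-triggered write batcher."""

    def __init__(
        self,
        flush_fn: Callable[[List[Row]], None],
        batch_size: int,
        flush_interval_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_fn = flush_fn
        self._batch_size = batch_size
        self._flush_interval_sec = flush_interval_sec
        self._clock = clock
        self._lock = threading.Lock()
        self._rows: List[Row] = []
        self._last_flush = clock()

    def add(self, row: Row) -> None:
        """Queue a row and flush immediately once the size threshold is reached."""
        with self._lock:
            self._rows.append(row)
            if len(self._rows) >= self._batch_size:
                self._flush_locked()

    def tick(self) -> None:
        """Flush pending rows if the interval has elapsed (call this periodically)."""
        with self._lock:
            elapsed = self._clock() - self._last_flush
            if self._rows and elapsed >= self._flush_interval_sec:
                self._flush_locked()

    def _flush_locked(self) -> None:
        # Swap the buffer out so the next batch starts empty.
        batch, self._rows = self._rows, []
        self._last_flush = self._clock()
        self._flush_fn(batch)


# Demo with a fake clock instead of a real background thread.
fake_now = [0.0]
flushed_batches: List[List[Row]] = []
batcher = SimpleWriteBatcher(
    flushed_batches.append,
    batch_size=2,
    flush_interval_sec=10.0,
    clock=lambda: fake_now[0],
)
batcher.add({"driver_id": 1})
batcher.add({"driver_id": 2})  # size threshold reached: first flush
batcher.add({"driver_id": 3})
fake_now[0] = 11.0
batcher.tick()  # interval elapsed: second flush
```

In a real server, a dedicated thread would call something like tick() on a short cadence; the sketch leaves that loop out to stay testable.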
Key Configuration Fields
| Field | Type | Description | Default |
|---|---|---|---|
| project | string | Project identifier | Required |
| registry | RegistryConfig | Registry configuration (file/SQL/remote) | Required |
| provider | string | Provider type (local, gcp, aws, azure) | "local" |
| online_store | OnlineStoreConfig | Online store configuration | sqlite |
| offline_store | OfflineStoreConfig | Offline store configuration | dask |
| auth | AuthConfig | Authentication configuration | no_auth |
| feature_server | FeatureServerConfig | Feature server-specific config | None |
Sources: sdk/python/feast/repo_config.py253-364 sdk/python/feast/repo_config.py109-112
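To make the required fields and defaults in the table concrete, the following sketch applies them to a plain dictionary. It only mirrors the table; it is not how RepoConfig validates its input, and the function name resolve_config and the example values are hypothetical.

```python
from typing import Any, Dict

# Required fields and defaults copied from the table above.
REQUIRED_FIELDS = ("project", "registry")
DEFAULTS: Dict[str, Any] = {
    "provider": "local",
    "online_store": "sqlite",
    "offline_store": "dask",
    "auth": "no_auth",
    "feature_server": None,
}


def resolve_config(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Reject configs that lack required fields, then fill in defaults."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    return {**DEFAULTS, **raw}


# Hypothetical minimal configuration: only the required fields are set.
resolved = resolve_config({"project": "my_project", "registry": "data/registry.db"})

# A config without a registry is rejected.
try:
    resolve_config({"project": "my_project"})
    missing_error = None
except ValueError as exc:
    missing_error = str(exc)
```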
Example Configurations
Local Development:
Production with SQL Registry:
Sources: docs/getting-started/quickstart.md106-118 sdk/python/feast/repo_config.py170-183
Starting the Server
CLI Command
Startup sequence:
- feast serve CLI (sdk/python/feast/cli/serve.py) loads feature_store.yaml via create_feature_store()
- Calls start_server() (sdk/python/feast/feature_server.py733-805)
- Platform check:
  - Non-Windows: FeastServeApplication (Gunicorn + UvicornWorker) (sdk/python/feast/feature_server.py697-714)
  - Windows: Direct uvicorn.run() (sdk/python/feast/feature_server.py793-805)
- get_app() builds FastAPI application (sdk/python/feast/feature_server.py211-666)
- Server listens on port 6566 (configurable via --port)
Sources: sdk/python/feast/cli/serve.py sdk/python/feast/feature_server.py733-805 sdk/python/feast/feature_server.py697-714
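As a rough illustration of the startup sequence, the sketch below assembles a feast serve command line from the flags named on this page and mirrors the platform check. The helper names build_serve_command and select_runner are hypothetical, and the "win32" comparison is an assumption about how the platform is detected.

```python
from typing import List, Optional


def build_serve_command(
    port: int = 6566,
    workers: Optional[int] = None,
    registry_ttl_sec: Optional[int] = None,
    metrics: bool = False,
) -> List[str]:
    """Assemble an argv list for `feast serve` using flags mentioned on this page."""
    command = ["feast", "serve", "--port", str(port)]
    if workers is not None:
        command += ["--workers", str(workers)]
    if registry_ttl_sec is not None:
        command += ["--registry_ttl_sec", str(registry_ttl_sec)]
    if metrics:
        command.append("--metrics")
    return command


def select_runner(platform: str) -> str:
    """Mirror the platform check: plain Uvicorn on Windows, Gunicorn elsewhere."""
    # Assumption: Windows is detected via a sys.platform value of "win32".
    return "uvicorn" if platform == "win32" else "gunicorn+uvicorn-worker"


default_command = build_serve_command()
tuned_command = build_serve_command(workers=-1, registry_ttl_sec=60, metrics=True)
```

The resulting list can be passed to subprocess.run() to launch the server from a feature repository directory; the sketch does not execute it.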
REST API Endpoints
The FastAPI application (sdk/python/feast/feature_server.py211-666) exposes HTTP-only REST endpoints. No gRPC interface is provided (see Go/Java feature servers for gRPC).
Endpoint Summary
| Method | Path | Handler | Description |
|---|---|---|---|
| POST | /get-online-features | get_online_features | Retrieve features from online store |
| POST | /retrieve-online-documents | retrieve_online_documents | Vector similarity search (Alpha) |
| POST | /push | push | Push data to online/offline store |
| POST | /write-to-online-store | write_to_online_store | Legacy write endpoint |
| POST | /materialize | materialize | Materialize offline→online |
| POST | /materialize-incremental | materialize_incremental | Incremental materialization |
| GET | /health | health | Health check (200 when registry loaded) |
| POST | /chat | chat | Experimental chat endpoint |
| GET | /chat | chat_ui | Serves chat HTML UI |
| WS | /ws/chat | websocket_endpoint | WebSocket chat endpoint |
Sources: sdk/python/feast/feature_server.py323-661
POST /get-online-features
Retrieves features from the online store for real-time inference.
Request Model (sdk/python/feast/feature_server.py106-110):
Example request:
Response: OnlineResponse proto serialized via MessageToDict():
Implementation: Handler (sdk/python/feast/feature_server.py323-351) checks async_supported.online.read to determine whether to call get_online_features_async() or use run_in_threadpool().
Sources: sdk/python/feast/feature_server.py323-351 sdk/python/feast/feature_server.py106-110
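A client call to this endpoint might look like the following sketch, which only builds the HTTP request and does not send it. The body fields (features, entities, full_feature_names) and the driver_hourly_stats feature names follow Feast quickstart conventions and are assumptions here, since the request model is not reproduced on this page; build_online_features_request is a hypothetical helper.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_online_features_request(
    base_url: str,
    features: List[str],
    entities: Dict[str, List[Any]],
    full_feature_names: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /get-online-features."""
    body = {
        "features": features,
        "entities": entities,
        "full_feature_names": full_feature_names,
    }
    return urllib.request.Request(
        url=f"{base_url}/get-online-features",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical feature references and entity keys.
request = build_online_features_request(
    "http://localhost:6566",
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
    entities={"driver_id": [1001, 1002]},
)
decoded_body = json.loads(request.data)
```

Against a running server, urllib.request.urlopen(request) would send it and return the JSON response.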
POST /retrieve-online-documents
Vector similarity search endpoint (Alpha). Handler (sdk/python/feast/feature_server.py353-386) calls retrieve_online_documents() or retrieve_online_documents_v2() based on api_version.
Request Model (sdk/python/feast/feature_server.py113-120):
Sources: sdk/python/feast/feature_server.py353-386 sdk/python/feast/feature_server.py113-120
POST /push
Writes feature values to online and/or offline store for features backed by a PushSource.
Request Model (sdk/python/feast/feature_server.py84-89):
Offline Write Batching:
When enabled (sdk/python/feast/feature_server.py251-285), offline writes are queued in OfflineWriteBatcher (sdk/python/feast/feature_server.py814-938). The endpoint returns:
- HTTP 202 (Accepted) for batched offline writes
- HTTP 200 (OK) for immediate writes
Configuration:
Sources: sdk/python/feast/feature_server.py388-480 sdk/python/feast/feature_server.py84-89 sdk/python/feast/feature_server.py251-285
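The sketch below shows one way a client could assemble a /push request body. The field names push_source_name, df, and to are assumptions based on common Feast usage rather than on the request model itself, and build_push_body and the example source name are hypothetical. The three destination values come from the text above.

```python
import json
from typing import Any, Dict, List

VALID_PUSH_TARGETS = ("online", "offline", "online_and_offline")


def build_push_body(
    push_source_name: str,
    df: Dict[str, List[Any]],
    to: str = "online",
) -> str:
    """Serialize a column-oriented push payload after basic sanity checks."""
    if to not in VALID_PUSH_TARGETS:
        raise ValueError(f"'to' must be one of {VALID_PUSH_TARGETS}, got {to!r}")
    # Every column must describe the same number of rows.
    if len({len(values) for values in df.values()}) > 1:
        raise ValueError("all columns in 'df' must have the same length")
    return json.dumps({"push_source_name": push_source_name, "df": df, "to": to})


# Hypothetical push source and columns.
push_body = json.loads(
    build_push_body(
        "driver_stats_push_source",
        {"driver_id": [1001], "conv_rate": [0.85]},
        to="online_and_offline",
    )
)

try:
    build_push_body("driver_stats_push_source", {"driver_id": [1001]}, to="nowhere")
    invalid_target_rejected = False
except ValueError:
    invalid_target_rejected = True
```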
POST /write-to-online-store
Legacy endpoint for direct online store writes. Handler (sdk/python/feast/feature_server.py492-505).
Request Model (sdk/python/feast/feature_server.py77-81):
Sources: sdk/python/feast/feature_server.py492-505 sdk/python/feast/feature_server.py77-81
POST /materialize
Triggers materialization from offline store to online store. Handler (sdk/python/feast/feature_server.py530-561).
Request Model (sdk/python/feast/feature_server.py92-97):
Behavior:
- If disable_event_timestamp=True: Materializes all data using current timestamp
- Otherwise: Requires start_ts and end_ts
Sources: sdk/python/feast/feature_server.py530-561 sdk/python/feast/feature_server.py92-97
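The rule above can be expressed as a small client-side check. This is a sketch only: build_materialize_body is a hypothetical helper, and the ISO 8601 string format for the timestamps is an assumption.

```python
import json
from typing import Any, Dict, Optional


def build_materialize_body(
    start_ts: Optional[str] = None,
    end_ts: Optional[str] = None,
    disable_event_timestamp: bool = False,
) -> str:
    """Build a /materialize body, enforcing the timestamp rule described above."""
    if not disable_event_timestamp and (start_ts is None or end_ts is None):
        raise ValueError(
            "start_ts and end_ts are required unless disable_event_timestamp is set"
        )
    body: Dict[str, Any] = {"disable_event_timestamp": disable_event_timestamp}
    if start_ts is not None:
        body["start_ts"] = start_ts
    if end_ts is not None:
        body["end_ts"] = end_ts
    return json.dumps(body)


# Assumed timestamp format: ISO 8601 strings.
windowed = json.loads(
    build_materialize_body("2024-01-01T00:00:00", "2024-01-02T00:00:00")
)
without_timestamps = json.loads(build_materialize_body(disable_event_timestamp=True))

try:
    build_materialize_body(start_ts="2024-01-01T00:00:00")
    missing_end_rejected = False
except ValueError:
    missing_end_rejected = True
```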
POST /materialize-incremental
Incremental materialization since last materialization. Handler (sdk/python/feast/feature_server.py563-576).
Request Model (sdk/python/feast/feature_server.py100-103):
Sources: sdk/python/feast/feature_server.py563-576 sdk/python/feast/feature_server.py100-103
GET /health
Health check endpoint. Returns HTTP 200 if registry loaded, HTTP 503 otherwise. Handler (sdk/python/feast/feature_server.py507-513) checks registry_proto variable.
Sources: sdk/python/feast/feature_server.py507-513
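The health rule is simple enough to sketch together with a readiness poll a client or deployment script might use. Both functions are hypothetical helpers; the probe is passed in as a callable, so no HTTP request is made here.

```python
import time
from typing import Callable, Optional


def health_status(registry_proto: Optional[object]) -> int:
    """Return 200 once a registry has been loaded, 503 before that."""
    return 200 if registry_proto is not None else 503


def wait_until_healthy(
    probe: Callable[[], int],
    max_attempts: int,
    delay_sec: float = 0.0,
) -> bool:
    """Poll a status probe until it reports 200 or the attempts run out."""
    for _ in range(max_attempts):
        if probe() == 200:
            return True
        time.sleep(delay_sec)
    return False


# Simulate a server whose registry becomes available on the third probe.
registry_states = iter([None, None, "registry-proto-placeholder"])
became_ready = wait_until_healthy(
    lambda: health_status(next(registry_states)), max_attempts=5
)
never_ready = wait_until_healthy(lambda: health_status(None), max_attempts=3)
```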
POST /chat
Experimental chat interface. Handler (sdk/python/feast/feature_server.py515-519).
Request Model (sdk/python/feast/feature_server.py123-125):
Sources: sdk/python/feast/feature_server.py515-519 sdk/python/feast/feature_server.py123-125
GET /chat
Serves HTML chat UI from static/chat directory. Handler (sdk/python/feast/feature_server.py521-528).
Sources: sdk/python/feast/feature_server.py521-528
WebSocket /ws/chat
WebSocket chat endpoint with rate limiting. Handler (sdk/python/feast/feature_server.py616-656).
Limits:
- Max connections: 5
- Max message size: 4096 bytes
- Max messages/minute: 60
- Read timeout: 60 seconds
Uses ConnectionManager (sdk/python/feast/feature_server.py595-608) for connection tracking.
Sources: sdk/python/feast/feature_server.py595-656 sdk/python/feast/feature_server.py595-608
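The limits listed above can be illustrated with a sliding-window sketch. This is not the ConnectionManager implementation: MessageRateLimiter and can_accept_connection are hypothetical names, and only the numeric limits are taken from this page.

```python
import time
from collections import deque
from typing import Callable, Deque

MAX_CONNECTIONS = 5
MAX_MESSAGE_BYTES = 4096
MAX_MESSAGES_PER_MINUTE = 60


def can_accept_connection(active_connections: int) -> bool:
    """Admit a new connection only while the cap has not been reached."""
    return active_connections < MAX_CONNECTIONS


class MessageRateLimiter:
    """Sliding-window limiter for one connection (hypothetical helper)."""

    def __init__(
        self,
        limit: int = MAX_MESSAGES_PER_MINUTE,
        window_sec: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window_sec = window_sec
        self._clock = clock
        self._timestamps: Deque[float] = deque()

    def allow(self, message: bytes) -> bool:
        """Reject oversized messages and messages beyond the per-window limit."""
        if len(message) > MAX_MESSAGE_BYTES:
            return False
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self._window_sec:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._limit:
            return False
        self._timestamps.append(now)
        return True


# Demo with a fake clock for deterministic behavior.
fake_now = [0.0]
limiter = MessageRateLimiter(clock=lambda: fake_now[0])
accepted = [limiter.allow(b"hi") for _ in range(61)]
fake_now[0] = 60.0
allowed_after_window = limiter.allow(b"hi")
oversized_allowed = limiter.allow(b"x" * (MAX_MESSAGE_BYTES + 1))
```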
MCP (Model Context Protocol) Support
Optional MCP support for AI agent integration.
Configuration:
The _add_mcp_support_if_enabled() function (sdk/python/feast/feature_server.py669-691) adds MCP endpoints during initialization. Implementation: feast.infra.mcp_servers.mcp_server.add_mcp_support_to_app(). Failures are non-fatal.
Sources: sdk/python/feast/feature_server.py669-691
Feature Retrieval Implementation
Online Feature Retrieval Path
The feature server delegates to FeatureStore.get_online_features() which orchestrates the retrieval:
get_online_features() Internal Data Flow
Sources: sdk/python/feast/feature_server.py323-351 sdk/python/feast/infra/passthrough_provider.py239-258
Registry Refresh Mechanism
The feature server uses a background timer for automatic registry refresh.
Refresh Implementation:
The async_refresh() closure (sdk/python/feast/feature_server.py293-304) calls store.refresh_registry() and caches the result as registry_proto. A threading.Timer re-schedules itself after each refresh. Timer lifecycle:
- Start: During lifespan startup (sdk/python/feast/feature_server.py312)
- Stop: During shutdown (sdk/python/feast/feature_server.py316)
Configuration:
| Parameter | Default | Description |
|---|---|---|
| --registry_ttl_sec | 5 | Feature server refresh interval |
| RegistryConfig.cache_ttl_seconds | 600 | In-memory cache TTL |
| RegistryConfig.cache_mode | "sync" | "sync" or "thread" |
Default TTL defined by DEFAULT_FEATURE_SERVER_REGISTRY_TTL (sdk/python/feast/constants.py48).
Behavior:
- /health returns HTTP 503 until first refresh completes
- Lower registry_ttl_sec = fresher schema, higher registry load
Sources: sdk/python/feast/feature_server.py293-319 sdk/python/feast/constants.py48 sdk/python/feast/repo_config.py136-184
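The self-rescheduling timer pattern described in this section can be sketched with the standard library alone. PeriodicRefresher is a hypothetical class, not the async_refresh() closure; a counter stands in for store.refresh_registry(), and the interval is shortened so the demo finishes quickly.

```python
import threading
import time
from typing import Callable, List, Optional


class PeriodicRefresher:
    """Run a refresh function, then re-arm a threading.Timer for the next run."""

    def __init__(self, refresh_fn: Callable[[], None], interval_sec: float) -> None:
        self._refresh_fn = refresh_fn
        self._interval_sec = interval_sec
        self._timer: Optional[threading.Timer] = None
        self._stopped = threading.Event()
        self._lock = threading.Lock()

    def start(self) -> None:
        """Perform the first refresh synchronously and schedule the next one."""
        self._run()

    def _run(self) -> None:
        if self._stopped.is_set():
            return
        self._refresh_fn()
        with self._lock:
            # Re-check under the lock so stop() cannot race with re-arming.
            if not self._stopped.is_set():
                self._timer = threading.Timer(self._interval_sec, self._run)
                self._timer.daemon = True
                self._timer.start()

    def stop(self) -> None:
        """Prevent further refreshes and cancel any pending timer."""
        self._stopped.set()
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()


# Demo: record refresh times instead of reloading a real registry.
refresh_times: List[float] = []
refresher = PeriodicRefresher(lambda: refresh_times.append(time.monotonic()), 0.01)
refresher.start()
time.sleep(0.1)
refresher.stop()
refresh_count = len(refresh_times)
```

Because the first refresh runs inside start(), a health check built on the cached result can report ready as soon as startup returns.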
Static Artifacts Loading (Alpha)
Load small ML models, lookup tables, or embeddings at server startup.
Usage: Create static_artifacts.py in your feature repository:
Access in on-demand feature views:
Artifacts loaded during lifespan startup (sdk/python/feast/feature_server.py306-319) before store.initialize(). Implementation: load_static_artifacts() (sdk/python/feast/feature_server.py165-208).
Note: For large models, use dedicated serving (vLLM, TGI).
Sources: sdk/python/feast/feature_server.py165-208 sdk/python/feast/feature_server.py306-319
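A static_artifacts.py file might look roughly like the sketch below. The hook name load_artifacts(app) and the use of app.state are assumptions, since this page does not show the expected signature; the lookup table and its contents are hypothetical, and a SimpleNamespace stands in for the FastAPI application so the example runs without a server.

```python
from types import SimpleNamespace
from typing import Any, Dict


def load_lookup_table() -> Dict[str, float]:
    """Stand-in for loading a small artifact such as a lookup table."""
    return {"premium": 1.5, "standard": 1.0}


def load_artifacts(app: Any) -> None:
    """Attach artifacts to application state at startup (assumed hook name)."""
    app.state.tier_multipliers = load_lookup_table()


# A FastAPI app exposes a `state` attribute; SimpleNamespace mimics it here.
fake_app = SimpleNamespace(state=SimpleNamespace())
load_artifacts(fake_app)
premium_multiplier = fake_app.state.tier_multipliers["premium"]
```

Loading once at startup keeps per-request work to a dictionary lookup, which is why the approach suits small artifacts rather than large models.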
Deployment
Local Development
Sources: docs/getting-started/quickstart.md106-118
Kubernetes Deployment
Deployment Architecture
Characteristics:
- Stateless pods → horizontal scaling
- Health endpoints → liveness/readiness probes
- ConfigMap → feature_store.yaml
- Secrets → credentials
- Service mesh → traffic management
Sources: docs/how-to-guides/running-feast-in-production.md1-27 infra/feast-operator/README.md1-19
Production Configuration Example
The registry_type: sql triggers SQL registry initialization via SqlRegistry (sdk/python/feast/infra/registry/sql.py). The path must follow SQLAlchemy connection string format. Note the validation in RegistryConfig.validate_path() that auto-corrects postgresql:// to postgresql+psycopg://.
Sources: sdk/python/feast/repo_config.py136-184 docs/how-to-guides/running-feast-in-production.md33-49
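The scheme auto-correction mentioned above amounts to a prefix rewrite. The sketch below shows the idea only; normalize_registry_path is a hypothetical name, not RegistryConfig.validate_path(), and the connection string is made up.

```python
def normalize_registry_path(path: str) -> str:
    """Rewrite a bare postgresql:// scheme to the psycopg driver form."""
    prefix = "postgresql://"
    if path.startswith(prefix):
        return "postgresql+psycopg://" + path[len(prefix):]
    return path


# Hypothetical connection strings.
corrected = normalize_registry_path("postgresql://feast:secret@db.example.com:5432/feast")
unchanged = normalize_registry_path("postgresql+psycopg://feast:secret@db.example.com:5432/feast")
other_scheme = normalize_registry_path("sqlite:///registry.db")
```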
Integration with FeatureStore Class
The feature server is a thin HTTP wrapper around the FeatureStore class (sdk/python/feast/feature_store.py105-220); it exposes no gRPC interface. Each API request maps to a FeatureStore method call.
Request to Method Mapping
| HTTP Endpoint | FeatureStore Method |
|---|---|
| POST /get-online-features | get_online_features() / get_online_features_async() |
| POST /retrieve-online-documents | retrieve_online_documents() / retrieve_online_documents_v2() |
| POST /push | push() / push_async() |
| POST /write-to-online-store | write_to_online_store() |
| POST /materialize | materialize() |
| POST /materialize-incremental | materialize_incremental() |
Key FeatureStore Methods
get_online_features() (sdk/python/feast/feature_store.py1500-1652):
materialize() (sdk/python/feast/feature_store.py1900-2034):
The disable_event_timestamp parameter allows materialization without event timestamps, using the current datetime as the event timestamp. This is useful when source data lacks proper timestamp columns.
materialize_incremental() (sdk/python/feast/feature_store.py2036-2133):
Thread Safety
The feature server maintains a shared FeatureStore instance across requests:
- Registry: Lazy-loaded and cached with thread-safe access via property decorator
- Provider: Stateless PassthroughProvider delegates to online/offline stores
- Online Store: Implementations must handle concurrent reads (most do via connection pooling)
- Configuration: RepoConfig is immutable after initialization
The FeatureStore itself maintains minimal mutable state: the _feature_service_cache dict (sdk/python/feast/feature_store.py122), which is cleared on registry refresh.
Sources: sdk/python/feast/feature_server.py323-576 sdk/python/feast/feature_store.py105-220
Feature Server vs Direct SDK
| Aspect | Python Feature Server | Direct SDK |
|---|---|---|
| Protocol | HTTP REST | In-process |
| Languages | Any | Python only |
| Deployment | Centralized service | Embedded |
| Latency | +1-2ms overhead | Lowest |
| Scaling | Horizontal | Per-app |
| Registry cache | Shared | Per-instance |
| Use case | Microservices, polyglot | Python ML services |
Sources: docs/getting-started/quickstart.md38 sdk/python/feast/feature_store.py124-169
Performance
Optimization Strategies
Worker Configuration:
- Production: --workers -1 (auto: 2×CPU+1)
- High concurrency: --worker-connections 2000
- Memory leak prevention: --max-requests 1000
Registry Caching:
- --registry_ttl_sec 60 for production (balance freshness vs load)
- RegistryConfig.cache_ttl_seconds: 600 for in-memory cache
- RegistryConfig.cache_mode: "thread" for background refresh
Async Support:
The server automatically uses async methods when async_supported.online.read is true (sdk/python/feast/feature_server.py337-342):
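The dispatch between the async path and the threadpool path can be sketched as follows. The store classes and the async_read_supported attribute are hypothetical stand-ins for the async_supported.online.read check, and asyncio.to_thread (Python 3.9+) is used here in place of run_in_threadpool.

```python
import asyncio
from typing import Any, Dict, List

Entities = Dict[str, List[Any]]


class SyncOnlyStore:
    """Hypothetical store whose online reads are synchronous only."""

    async_read_supported = False

    def get_online_features(self, entities: Entities) -> Dict[str, Any]:
        return {"rows": len(entities["driver_id"]), "path": "threadpool"}


class AsyncStore:
    """Hypothetical store that offers a native async read."""

    async_read_supported = True

    async def get_online_features_async(self, entities: Entities) -> Dict[str, Any]:
        return {"rows": len(entities["driver_id"]), "path": "async"}


async def handle_get_online_features(store: Any, entities: Entities) -> Dict[str, Any]:
    """Await the async method when available, else offload the blocking call."""
    if store.async_read_supported:
        return await store.get_online_features_async(entities)
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(store.get_online_features, entities)


entities = {"driver_id": [1001, 1002]}
sync_result = asyncio.run(handle_get_online_features(SyncOnlyStore(), entities))
async_result = asyncio.run(handle_get_online_features(AsyncStore(), entities))
```

Either way the handler itself stays a coroutine, so a slow synchronous store does not block other requests on the same worker.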
Prometheus Metrics:
When --metrics is enabled, Prometheus server starts on port 8000 (sdk/python/feast/feature_server.py752-760):
| Metric | Type | Description |
|---|---|---|
| feast_feature_server_cpu_usage | Gauge | CPU % |
| feast_feature_server_memory_usage | Gauge | Memory % |
Updated by monitor_resources() thread (sdk/python/feast/feature_server.py717-730) every 5 seconds.
Sources: sdk/python/feast/feature_server.py733-805 sdk/python/feast/feature_server.py717-730 sdk/python/feast/feature_server.py337-342
Registry Cache Management
The registry cache significantly impacts latency. Key considerations:
- --registry_ttl_sec: The feature server's own refresh interval (default: 5s via DEFAULT_FEATURE_SERVER_REGISTRY_TTL). This is separate from the registry's internal cache_ttl_seconds.
- RegistryConfig.cache_ttl_seconds (default: 600s): The in-memory cache TTL within the CachingRegistry layer.
- RegistryConfig.cache_mode:
  - "sync": Immediate consistency, higher latency on writes
  - "thread": Background refresh, lower latency
The feature server's timer-based refresh (sdk/python/feast/feature_server.py293-304) runs independently of the registry's own cache mode.
Sources: sdk/python/feast/feature_server.py293-304 sdk/python/feast/constants.py48 sdk/python/feast/repo_config.py136-161
Online Store Latency:
| Store | Latency | Notes |
|---|---|---|
| SQLite | 1–10ms | Local only |
| Redis | 1–5ms | Horizontal scaling |
| DynamoDB | 5–20ms | Managed, async |
| PostgreSQL | 2–10ms | Async support |
Sources: sdk/python/feast/repo_config.py68-89
Security
Authentication
Initialized in start_server() (sdk/python/feast/feature_server.py763-772):
Auth types (auth_config.type):
| Type | Description |
|---|---|
| no_auth | Default |
| kubernetes | Service account tokens |
| oidc | OAuth 2.0 / OpenID Connect |
Endpoints use inject_user_details dependency (sdk/python/feast/feature_server.py325) and assert_permissions() for RBAC.
Network Security:
- TLS: --key/--cert flags (sdk/python/feast/feature_server.py748-751)
- API gateway or service mesh
- Network policies for online store access
Sources: sdk/python/feast/feature_server.py763-772 sdk/python/feast/feature_server.py748-805
Monitoring
Metrics
Prometheus metrics on port 8000 when --metrics enabled (sdk/python/feast/feature_server.py752-760):
| Metric | Type | Description |
|---|---|---|
| feast_feature_server_cpu_usage | Gauge | CPU % |
| feast_feature_server_memory_usage | Gauge | Memory % |
Updated by monitor_resources() (sdk/python/feast/feature_server.py717-730) every 5 seconds via psutil.
Logging
FastAPI logger tracks:
- Registry refreshes (sdk/python/feast/feature_server.py293-304)
- Batch flush events (sdk/python/feast/feature_server.py814-938)
- Static artifact loads (sdk/python/feast/feature_server.py206-208)
- MCP initialization (sdk/python/feast/feature_server.py683-690)
Global exception handler (sdk/python/feast/feature_server.py578-592) logs tracebacks and returns structured JSON errors via FeastError.to_error_detail().
Sources: sdk/python/feast/feature_server.py68-73 sdk/python/feast/feature_server.py578-592 sdk/python/feast/feature_server.py717-730
Troubleshooting
Common Issues
High Latency:
- Check online store latency and connection pool settings
- Verify registry cache TTL is not too low
- Ensure network connectivity is optimal
Feature Not Found Errors:
- Verify feast apply was run to register features
- Check registry cache TTL and refresh
- Confirm feature view names match request
Authentication Failures:
- Verify auth configuration matches client settings
- Check OIDC token validity and expiration
- Ensure service accounts have proper permissions
Sources: sdk/python/feast/errors.py17-70 sdk/python/feast/feature_store.py208-223