Online store is Redis, not DynamoDB or Postgres direct

Status: Accepted · PredictFlow Feature Store · 02: Feature Store & Feature Registry
By AI-DE Engineering Team · Stakeholders: ML platform engineer, data infra lead, security reviewer

Context

The online store serves features at inference time, on the request path. Latency budget for BentoML.predict() is <50 ms P99 end-to-end across the gateway, the feature lookup, the model forward pass, and the output serialization. The feature lookup itself must be <10 ms P99 to leave room for model compute. Feast supports Redis, DynamoDB, Datastore, SQLite, Bigtable, Postgres, and Cassandra as online-store backends; the practical choices for a small-team production deploy are:

  1. Redis (ElastiCache) — in-memory KV, sub-ms gets, well-understood ops.
  2. DynamoDB — managed, autoscaling, single-digit-ms gets.
  3. Postgres direct — reuse the registry/offline-store database.

Decision

Adopt Redis (ElastiCache for production, local Docker for the tutorial).

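# feature_store.yaml
online_store:
  type: redis
  connection_string: localhost:6379  # ElastiCache primary endpoint in prod

# Online lookup (Module 02)
from feast import FeatureStore

# Assumes the current directory is the Feast repo holding feature_store.yaml
store = FeatureStore(repo_path=".")
features = store.get_online_features(
    features=["churn_features:tenure_months", ...],  # remaining feature refs elided
    entity_rows=[{"customer_id": 12345}],
).to_dict()
# P99 < 5 ms in the benchmarks shown in part-2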
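
One way to check a lookup against the <10 ms P99 budget is to time repeated calls and read off the 99th percentile. The sketch below is a minimal harness, not the project's benchmark: `measure_latencies_ms` and `p99` are hypothetical helper names, and the stand-in workload would be replaced by a closure around `store.get_online_features(...)`.

```python
import statistics
import time
from typing import Callable, List


def measure_latencies_ms(lookup: Callable[[], object], n_calls: int = 1000) -> List[float]:
    """Call `lookup` n_calls times and return each call's wall-clock latency in ms."""
    samples = []
    for _ in range(n_calls):
        start = time.perf_counter()
        lookup()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def p99(samples_ms: List[float]) -> float:
    """99th percentile: the last of the 99 cut points that split the data into 100 groups."""
    return statistics.quantiles(samples_ms, n=100)[98]


if __name__ == "__main__":
    # Stand-in workload (assumption); swap in the real online lookup here.
    latencies = measure_latencies_ms(lambda: sum(range(100)), n_calls=500)
    print(f"P99 = {p99(latencies):.3f} ms over {len(latencies)} calls")
```

Timing in-process like this excludes gateway and serialization cost, so it bounds only the lookup slice of the 50 ms budget.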

Tradeoffs we accept

| Lever | Redis (chosen) | DynamoDB | Postgres direct |
|---|---|---|---|
| Read latency (P99) | <5 ms | <10 ms | 15-30 ms |
| Write latency | <2 ms | <10 ms | 20-50 ms |
| Throughput ceiling | ~100k ops/sec/node | Autoscaling | ~5k ops/sec |
| Durability | RDB + AOF, replica | Native | WAL |
| Cost at 4M chunks × 4 tenants | ~$54/mo (cache.t4g.small × 2) | Per-RCU pricing, hard to budget | $0 marginal (reuse RDS) |
| Vendor lock-in | None | AWS-only | None |
| Operational complexity | Standard | Zero (managed) | None new |
| Local-dev parity | Single Docker container | LocalStack workaround | Dev DB |

We optimize for read latency, vendor independence, and local-dev parity. Redis wins on raw latency by 2-3× over DynamoDB. Postgres direct adds enough latency to blow our P99 budget once you stack network and model compute on top.
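
The budget argument can be made concrete with a little arithmetic. The sketch below takes the upper-bound read latencies from the tradeoff table and compares them with the <10 ms lookup budget and the <50 ms end-to-end budget from the Context section. The function names are hypothetical, and subtracting P99 values is only a rough, conservative approximation, since percentiles of separate stages do not add exactly.

```python
# Upper-bound P99 read latencies (ms) as quoted in the tradeoff table.
READ_P99_MS = {"redis": 5.0, "dynamodb": 10.0, "postgres": 30.0}

LOOKUP_BUDGET_MS = 10.0      # feature lookup budget from the Context section
END_TO_END_BUDGET_MS = 50.0  # BentoML.predict() budget from the Context section


def lookup_headroom_ms(backend: str) -> float:
    """Lookup budget minus the backend's quoted P99; negative means over budget."""
    return LOOKUP_BUDGET_MS - READ_P99_MS[backend]


def remaining_for_rest_ms(backend: str) -> float:
    """What is left for gateway, model forward pass, and serialization."""
    return END_TO_END_BUDGET_MS - READ_P99_MS[backend]


if __name__ == "__main__":
    for name in READ_P99_MS:
        print(
            f"{name:9s} headroom={lookup_headroom_ms(name):+6.1f} ms  "
            f"rest-of-path={remaining_for_rest_ms(name):5.1f} ms"
        )
```

On these figures only Redis leaves positive headroom inside the lookup budget; DynamoDB sits at the limit and Postgres direct exceeds it.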

Consequences (positive)

  • P99 feature-lookup latency consistently <5 ms in part-3's load tests.
  • Local development runs the redis:7-alpine Docker image, the same Redis major version as the ElastiCache primary in prod, so behavior should match closely.
  • Redis pubsub is available for real-time feature freshness signals (used in Module 04's drift detection).
  • Reset is one FLUSHALL — fast iteration during Module 02 hacking.
  • Cost is bounded: a cache.t4g.small primary + replica handles the 4-tenant load with 50% headroom (see cost-model CSV).

Consequences (negative)

  • Memory cap. Redis keeps the whole dataset in RAM, so capacity is bounded by node memory. At 32M chunks × 1 KB average the working set is ~32 GB, roughly ten times what a cache.t4g.medium (3.09 GB) can hold, so only the hot subset of features can stay resident. Mitigation: per-tenant TTL eviction on long-tail features.
  • Snapshotting cost. AOF + RDB on a hot Redis can cost 10-15% CPU. Mitigation: replica handles snapshots; primary stays clean.
  • No native time-travel. Redis holds current state only; point-in-time (PIT) correctness for training comes from the offline store (Parquet on S3), not Redis. This is by design: see src/training_features.py for the offline PIT join.
  • Redis outage = stale prediction. If Redis is down, the BentoML service either serves stale features or returns 503s. Mitigation: liveness probe in k8s/deployment.yaml checks Redis reachability; fallback path is documented in the runbook.
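
As a quick sizing check on the figures in the memory-cap bullet, the sketch below computes the working set and the share of it that a single node could keep resident. It uses decimal gigabytes and ignores Redis per-key overhead, both simplifying assumptions; the helper names are hypothetical.

```python
GB = 1_000_000_000  # decimal gigabyte, matching the "~32 GB" figure in the bullet


def working_set_gb(n_items: int, avg_bytes: int) -> float:
    """Raw payload size in GB, ignoring Redis per-key overhead (assumption)."""
    return n_items * avg_bytes / GB


def resident_fraction(node_gb: float, working_gb: float) -> float:
    """Share of the working set that fits in one node's memory, capped at 1.0."""
    return min(1.0, node_gb / working_gb)


if __name__ == "__main__":
    ws = working_set_gb(32_000_000, 1_000)  # 32M chunks × 1 KB average
    frac = resident_fraction(3.09, ws)      # cache.t4g.medium memory from the bullet
    print(f"working set = {ws:.1f} GB, resident share = {frac:.1%}")
```

A resident share near 10% is why eviction policy matters here: without TTLs on long-tail features, the hot set competes with rarely read keys for the same memory.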

Reversal plan

feature_store.yaml is the single config knob. To swap online stores:

  1. Update online_store.type in YAML.
  2. Re-run feast materialize to populate the new store.
  3. Switch ElastiCache → DynamoDB / Postgres in the deployment manifests.
  4. Re-run Module 03 integration tests (latency assertions in tests/integration/test_api.py will fail if the swap blows the <50 ms P99 budget — fail-loud, not fail-silent).

Estimated effort: 2-4 engineer-days for a tested swap. Reversible.
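
The fail-loud behavior in step 4 can be sketched as a helper that raises when the measured P99 exceeds the budget. This is an illustration only, not the contents of tests/integration/test_api.py; `assert_p99_within_budget` is a hypothetical name, and the latency samples are assumed to be collected elsewhere.

```python
import statistics
from typing import Sequence

P99_BUDGET_MS = 50.0  # end-to-end budget from the Context section


def assert_p99_within_budget(
    samples_ms: Sequence[float], budget_ms: float = P99_BUDGET_MS
) -> float:
    """Return the P99 of samples_ms, raising AssertionError if it is not under budget."""
    if len(samples_ms) < 100:
        raise ValueError("need at least 100 samples for a meaningful P99")
    p99 = statistics.quantiles(samples_ms, n=100)[98]
    if p99 >= budget_ms:
        raise AssertionError(f"P99 {p99:.1f} ms is not under the {budget_ms:.1f} ms budget")
    return p99


if __name__ == "__main__":
    print(assert_p99_within_budget([5.0] * 100))  # well under budget
    try:
        assert_p99_within_budget([5.0] * 90 + [80.0] * 10)  # slow tail breaks the budget
    except AssertionError as exc:
        print(f"swap rejected: {exc}")
```

Raising explicitly rather than using a bare `assert` keeps the check active even when Python runs with optimizations enabled.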

References

  • feature_store.yaml
  • feature_store/sync.py (Kafka → Redis materialization)
  • test_online_features.py (latency benchmark)
  • ADR-001 (Feast — choice of orchestration layer is independent of online store)