Use Feast over Tecton / Hopsworks / DIY for the feature-store layer

✓ Accepted · PredictFlow Feature Store · 02 — Feature Store & Feature Registry
By AI-DE Engineering Team · Stakeholders: ML platform engineer, data platform lead, MLOps reviewer

Context

The platform must serve point-in-time-correct (PIT) features for training and sub-100 ms online lookups for inference, across multiple models and teams. Training-serving skew is the failure mode that every previous attempt on this team has hit, and cross-team feature reuse is an explicit requirement. The classic options:

  1. Tecton — managed feature store, opinionated, best-in-class PIT correctness.
  2. Hopsworks — open-source-with-managed feature store, broader scope (full ML platform).
  3. Feast — open-source feature-store orchestration layer (BYO offline + online + registry stores).
  4. DIY — Postgres + Redis + a Python FeatureView class.

We are building a reference MLOps platform for tutorial purposes, so the choice has to be reproducible by a learner on a laptop in under 15 minutes and still survive a real production deploy.

Decision

Adopt Feast as the feature-store orchestration layer.

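# feature_store.yaml
project: predictflow
provider: local
registry:
  registry_type: sql  # Feast's SQL registry expects this key plus a SQLAlchemy URL
  path: postgresql://...
online_store:
  type: redis
  connection_string: localhost:6379
offline_store:
  type: file  # parquet on local FS or S3

# features/customer_features.py
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource, ValueType
from feast.types import Float32, Int64

# Batch source backing the view; the path and timestamp column are illustrative.
parquet_source = FileSource(
    path="data/customer_features.parquet",
    timestamp_field="event_timestamp",
)

# Entity.value_type takes a ValueType enum, not a feast.types dtype.
customer = Entity(name="customer_id", value_type=ValueType.INT64)

churn_features = FeatureView(
    name="churn_features",
    entities=[customer],
    ttl=timedelta(days=30),  # values older than 30 days are not served
    schema=[
        Field(name="tenure_months", dtype=Float32),
        Field(name="monthly_charges", dtype=Float32),
        # ...
    ],
    source=parquet_source,
)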
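
As a rough illustration of how these definitions would be consumed, the sketch below reads the same feature references on both the training path and the serving path, which is the property that guards against training-serving skew. It assumes `store` is a `feast.FeatureStore` pointed at this repo. `get_historical_features` and `get_online_features` are Feast calls; `feature_refs`, `training_frame`, `online_vector`, and the shape of the entity DataFrame are hypothetical names chosen for this example.

```python
# Sketch only: `store` is assumed to be a feast.FeatureStore(repo_path=".").
# Feature references use Feast's "<feature_view>:<feature>" form.
FEATURE_VIEW = "churn_features"
FEATURE_NAMES = ("tenure_months", "monthly_charges")


def feature_refs(view=FEATURE_VIEW, names=FEATURE_NAMES):
    """Build the reference strings shared by training and serving."""
    return [f"{view}:{name}" for name in names]


def training_frame(store, entity_df):
    """Offline path: point-in-time join against the offline store.

    entity_df is assumed to be a pandas DataFrame with `customer_id`
    and `event_timestamp` columns.
    """
    job = store.get_historical_features(entity_df=entity_df, features=feature_refs())
    return job.to_df()


def online_vector(store, customer_id):
    """Online path: key-value lookup against the online store."""
    response = store.get_online_features(
        features=feature_refs(),
        entity_rows=[{"customer_id": customer_id}],
    )
    return response.to_dict()
```

Because both helpers derive their feature list from `feature_refs()`, adding or removing a feature changes training and inference together.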

Tradeoffs we accept

| Lever | Tecton | Hopsworks | DIY | Feast (chosen) |
|---|---|---|---|---|
| Day-1 setup | Vendor onboarding | Self-host or managed | Hours of glue code | pip install feast + feast init |
| PIT correctness | Native | Native | We build it | Native (get_historical_features) |
| Online latency (P99) | <5 ms | <10 ms | <10 ms (Redis) | <10 ms (Redis online store) |
| Cross-team feature reuse | Strong (registry UI) | Strong | Build the registry | Real registry, no UI by default |
| Vendor lock-in | High | Medium | None | None |
| Cost at <100 features | $$$ | $ infra | $ infra | $ infra |
| Streaming sync | Native | Native | We build it | Streaming engine API + Kafka examples |
| Tutorial reproducibility | Cannot (vendor) | Heavy infra | Build everything | One YAML + Python decorators |

We optimize for tutorial reproducibility plus production portability. Tecton wins on managed polish, but a learner can't reproduce a managed SaaS on a laptop. Hopsworks pulls in a full ML platform, which is more than this project needs. DIY is what every team starts with and regrets. Feast is the open-source middle ground: it runs locally, and it is already proven in production.

Consequences (positive)

  • A learner runs feast init + edits two Python files and has a working feature store in <15 minutes (Module 02 ships in this time budget).
  • Feature definitions are declarative Python objects (Entity, FeatureView, Field), so a swap to Tecton or Hopsworks is a bounded translation of those definitions, not a rewrite of every feature's logic.
  • Feast's PIT correctness is battle-tested — we don't reimplement the hardest bug class in feature engineering.
  • The feature_store.yaml config is the single point of vendor swap — online-store, offline-store, and registry are independent dimensions.
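
To make the point-in-time rule concrete, here is a minimal pure-Python sketch of the lookup that a PIT-correct join performs for each training row. It is not Feast's implementation. It only illustrates the rule (take the latest value at or before the row's timestamp, and reject values older than the TTL), and `point_in_time_lookup` and the sample history are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def point_in_time_lookup(feature_rows, entity_ts, ttl):
    """Return the latest feature value at or before entity_ts, within ttl.

    feature_rows: list of (event_timestamp, value) sorted by timestamp.
    Returns None when no row qualifies.
    """
    timestamps = [ts for ts, _ in feature_rows]
    idx = bisect_right(timestamps, entity_ts)  # count of rows with ts <= entity_ts
    if idx == 0:
        return None  # only future rows exist; using one would leak information
    ts, value = feature_rows[idx - 1]
    if entity_ts - ts > ttl:
        return None  # newest eligible row is older than the TTL
    return value


history = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 2, 1), 11.0),
    (datetime(2024, 3, 1), 12.0),
]
ttl = timedelta(days=30)

as_of_feb_15 = point_in_time_lookup(history, datetime(2024, 2, 15), ttl)  # 11.0
before_any = point_in_time_lookup(history, datetime(2023, 12, 31), ttl)   # None
stale = point_in_time_lookup(history, datetime(2024, 5, 1), ttl)          # None
```

The mid-February row sees the February value rather than the later March one. That exclusion of future data is exactly what a naive "join on latest value" gets wrong.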

Consequences (negative)

  • No managed UI for browsing the registry. We build a thin Python FeatureRegistry (feature_store/registry.py) for the tutorial; in production, teams typically front it with a web UI or pull it into Atlas.
  • Streaming sync is BYO orchestration. Feast exposes write_to_online_store() but doesn't run a Kafka consumer for you. Module 02 ships the consumer in feature_store/sync.py.
  • No native lineage UI. The migrations/feature_registry.sql schema includes a feature_lineage table; querying it is a SQL exercise, not a UI click.
  • Backfill orchestration is BYO. feature_store/backfill.py shows the pattern; production teams would wrap it in Airflow / Argo.
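
The BYO streaming sync can be pictured as a small micro-batching step between the consumer and the online store. The sketch below is one possible shape, not the contents of feature_store/sync.py. `latest_per_entity` and `sync_batch` are hypothetical names, the writer is injected so the logic stays independent of Kafka and Feast, and messages are assumed to be JSON objects with ISO 8601 UTC timestamps in a single format, so that string comparison orders them correctly.

```python
import json


def latest_per_entity(raw_messages, key="customer_id", ts_field="event_timestamp"):
    """Decode JSON messages and keep only the newest row per entity key."""
    latest = {}
    for raw in raw_messages:
        row = json.loads(raw)
        current = latest.get(row[key])
        # Later messages win ties, so replays of the same event are harmless.
        if current is None or row[ts_field] >= current[ts_field]:
            latest[row[key]] = row
    return list(latest.values())


def sync_batch(raw_messages, write_rows):
    """Collapse one polled batch and hand it to the online-store writer.

    write_rows is injected. With Feast it could wrap, for example,
    store.write_to_online_store("churn_features", pd.DataFrame(rows)).
    Returns the number of rows written.
    """
    rows = latest_per_entity(raw_messages)
    if rows:
        write_rows(rows)
    return len(rows)
```

Collapsing each batch to one row per entity before writing keeps the number of online-store writes proportional to the number of active entities rather than the number of events.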

Reversal plan

The FeatureRegistry interface (register_feature, get_features, record_lineage) is a thin Python class, and the feature definitions are declarative Python objects. Swap is bounded:

  1. Replace feature_store.yaml with the new tool's config.
  2. Translate FeatureView definitions to the new tool's equivalent (Tecton uses decorators such as @batch_feature_view and @stream_feature_view; Hopsworks uses fs.create_feature_group()).
  3. Translate the Kafka sync from write_to_online_store() to the new tool's online API.
  4. Re-run the Module 03 integration suite to confirm online lookups.

Estimated effort: 2-3 engineer-weeks for a tested swap. Reversible.
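
One way to keep the swap bounded is to have calling code depend only on the three registry methods named above. The sketch below is an assumption about how that could look, not the contents of feature_store/registry.py. The method names come from this ADR, but the signatures, the `InMemoryRegistry` backend, and the `onboard` helper are hypothetical.

```python
from typing import Optional, Protocol


class FeatureRegistry(Protocol):
    """The surface that calling code is allowed to depend on."""

    def register_feature(self, name: str, owner: str, dtype: str) -> None: ...
    def get_features(self, owner: Optional[str] = None) -> list: ...
    def record_lineage(self, feature: str, upstream: str) -> None: ...


class InMemoryRegistry:
    """Stand-in backend; a Feast- or Tecton-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._features = {}
        self._lineage = []

    def register_feature(self, name: str, owner: str, dtype: str) -> None:
        self._features[name] = {"owner": owner, "dtype": dtype}

    def get_features(self, owner: Optional[str] = None) -> list:
        return sorted(
            name
            for name, meta in self._features.items()
            if owner is None or meta["owner"] == owner
        )

    def record_lineage(self, feature: str, upstream: str) -> None:
        if feature not in self._features:
            raise KeyError(f"unknown feature: {feature}")
        self._lineage.append((feature, upstream))


def onboard(registry: FeatureRegistry) -> list:
    """Caller code: touches only the protocol, never a vendor SDK."""
    registry.register_feature("tenure_months", owner="churn-team", dtype="float32")
    registry.record_lineage("tenure_months", upstream="raw.customers")
    return registry.get_features(owner="churn-team")
```

With this shape, a migration replaces the backend class, while callers such as `onboard` stay unchanged.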

References

  • feature_store.yaml
  • features/{customer_features,behavioral_features}.py
  • feature_store/{registry,sync,backfill}.py
  • migrations/feature_registry.sql
  • ADR-002 (Redis online store choice — independent of Feast)
  • ADR-003 (MLflow + DVC — tracking + data versioning, complements Feast)