# ADR-001 — Use Feast over Tecton / Hopsworks / DIY for the feature-store layer

- **Status:** Accepted
- **Date:** 2026-05-09
- **Module:** 02 — Feature Store & Feature Registry
- **Stakeholders:** ML platform engineer, data platform lead, MLOps reviewer

## Context

The platform must serve point-in-time-correct (PIT) features for training and
sub-100 ms online lookups for inference, across multiple models and teams.
Training-serving skew is the failure mode every previous attempt on this
team has hit. Feature reuse across teams is an explicit requirement. The
options considered:

1. **Tecton** — managed feature store, opinionated, best-in-class PIT correctness.
2. **Hopsworks** — open-source-with-managed feature store, broader scope (full ML platform).
3. **Feast** — open-source feature-store _orchestration layer_ (BYO offline + online + registry stores).
4. **DIY** — Postgres + Redis + a Python `FeatureView` class.

We are building a reference MLOps platform for tutorial purposes — the
choice has to be reproducible by a learner on a laptop in <15 minutes
and survive a real production deploy.

## Decision

Adopt **Feast** as the feature-store orchestration layer.

```yaml
# feature_store.yaml
project: predictflow
provider: local
registry:
  registry_type: sql # SQL-backed registry; a bare string is read as a file path
  path: postgresql://...
online_store:
  type: redis
  connection_string: localhost:6379
offline_store:
  type: file # parquet on local FS or S3
```

```python
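# features/customer_features.py
from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource, ValueType
from feast.types import Float32

# Placeholder source: the path and timestamp column are illustrative assumptions.
parquet_source = FileSource(
    path="data/customer_features.parquet",
    timestamp_field="event_timestamp",
)

# Entities take a ValueType (not a feast.types dtype) and explicit join keys.
customer = Entity(
    name="customer_id",
    join_keys=["customer_id"],
    value_type=ValueType.INT64,
)

churn_features = FeatureView(
    name="churn_features",
    entities=[customer],
    ttl=timedelta(days=30),  # how far back a PIT join may look for a value
    schema=[
        Field(name="tenure_months", dtype=Float32),
        Field(name="monthly_charges", dtype=Float32),
        # ... (remaining fields elided)
    ],
    source=parquet_source,
)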
```

## Tradeoffs we accept

| Lever                    | Tecton               | Hopsworks            | DIY                | Feast (chosen)                        |
| ------------------------ | -------------------- | -------------------- | ------------------ | ------------------------------------- |
| Day-1 setup              | Vendor onboarding    | Self-host or managed | Hours of glue code | `pip install feast` + `feast init`    |
| PIT correctness          | Native               | Native               | We build it        | Native (`get_historical_features`)    |
| Online latency (P99)     | <5 ms                | <10 ms               | <10 ms (Redis)     | <10 ms (Redis online store)           |
| Cross-team feature reuse | Strong (registry UI) | Strong               | Build the registry | Real registry, no UI by default       |
| Vendor lock-in           | High                 | Medium               | None               | None                                  |
| Cost at <100 features    | $$$                  | $ infra              | $ infra            | $ infra                               |
| Streaming sync           | Native               | Native               | We build it        | Streaming engine API + Kafka examples |
| Tutorial reproducibility | Cannot (vendor)      | Heavy infra          | Build everything   | One YAML + Python feature definitions |

We optimize for **tutorial reproducibility** + **production portability**.
Tecton wins on managed polish, but a learner can't reproduce a managed
SaaS on a laptop. Hopsworks pulls in a full ML platform, which is more
than this project needs. DIY is what every team starts with and
regrets. Feast is the open-source middle ground: it ships PIT joins and
a registry out of the box, and it runs locally.

## Consequences (positive)

- A learner runs `feast init` + edits two Python files and has a working
  feature store in <15 minutes (Module 02 ships in this time budget).
- Feature definitions are plain, declarative Python objects, so a swap to
  Tecton or Hopsworks is a bounded translation of those definitions (see
  Reversal plan), not a redesign of the features themselves.
- Feast's PIT correctness is battle-tested, so we don't reimplement the
  join logic behind the hardest bug class in feature engineering.
- The `feature_store.yaml` config is the single point of vendor swap —
  online-store, offline-store, and registry are independent dimensions.
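
To make the PIT guarantee concrete, here is a minimal standard-library sketch of the rule a point-in-time join enforces: for each training event, use the latest feature value observed at or before the event time, and discard it if it is older than the TTL. This illustrates the semantics only and is not Feast's implementation; the function name `point_in_time_lookup` and the sample values are assumptions made for the example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def point_in_time_lookup(feature_rows, event_time, ttl):
    """Return the latest value observed at or before event_time, within ttl.

    feature_rows: list of (timestamp, value) tuples sorted by timestamp.
    Returns None when no value qualifies, so later values never leak in.
    """
    timestamps = [ts for ts, _ in feature_rows]
    # Index of the last row whose timestamp is <= event_time.
    idx = bisect_right(timestamps, event_time) - 1
    if idx < 0:
        return None  # every observation lies in the event's future
    observed_at, value = feature_rows[idx]
    if event_time - observed_at > ttl:
        return None  # the newest usable value is too stale
    return value


# Hypothetical history of one customer's monthly_charges.
history = [
    (datetime(2026, 1, 1), 29.0),
    (datetime(2026, 2, 1), 35.0),
    (datetime(2026, 3, 1), 41.0),
]
ttl = timedelta(days=30)

mid_february = point_in_time_lookup(history, datetime(2026, 2, 15), ttl)
before_any_data = point_in_time_lookup(history, datetime(2025, 12, 1), ttl)
stale = point_in_time_lookup(history, datetime(2026, 4, 15), ttl)
```

A label dated mid-February sees the February value rather than the March one, which is the leakage a naive "latest value" join would introduce.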

## Consequences (negative)

- **No managed UI** for browsing the registry. We build a thin Python
  `FeatureRegistry` (`feature_store/registry.py`) for the tutorial; in
  production, teams typically front it with a web UI or export it to a
  metadata catalog such as Apache Atlas.
- **Streaming sync is BYO orchestration.** Feast exposes
  `write_to_online_store()` but doesn't run a Kafka consumer for you.
  Module 02 ships the consumer in `feature_store/sync.py`.
- **No native lineage UI.** The `migrations/feature_registry.sql`
  schema includes a `feature_lineage` table; querying it is a SQL
  exercise, not a UI click.
- **Backfill orchestration is BYO.** `feature_store/backfill.py` shows
  the pattern; production teams would wrap it in Airflow / Argo.
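
As an illustration of the backfill point above, one common pattern is to split the backfill range into fixed windows and process them one at a time, so a failure can be retried per window. The sketch below is a hedged standard-library example and does not reproduce `feature_store/backfill.py`, whose contents are not shown here; `backfill_windows`, `run_backfill`, and the `materialize` callback are hypothetical names.

```python
from datetime import datetime, timedelta


def backfill_windows(start, end, step):
    """Yield consecutive half-open (window_start, window_end) pairs covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # clamp the final, shorter window
        yield cursor, window_end
        cursor = window_end


def run_backfill(materialize, start, end, step=timedelta(days=1)):
    """Call materialize(window_start, window_end) per window; return completed windows."""
    completed = []
    for window_start, window_end in backfill_windows(start, end, step):
        materialize(window_start, window_end)
        completed.append((window_start, window_end))
    return completed


# Stand-in callback that only records the windows it is asked to process.
calls = []
completed = run_backfill(
    lambda s, e: calls.append((s, e)),
    datetime(2026, 1, 1),
    datetime(2026, 1, 3, 12),
)
```

In a Feast repository the callback could plausibly be `FeatureStore.materialize`, which accepts a start and end datetime. An orchestrator such as Airflow or Argo would typically map each window to one retryable task.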

## Reversal plan

The `FeatureRegistry` interface (`register_feature`, `get_features`,
`record_lineage`) is a thin Python class, and the feature definitions
are declarative Python objects, so a swap is bounded:

1. Replace `feature_store.yaml` with the new tool's config.
2. Translate `FeatureView` definitions to the new tool's equivalent
   (Tecton uses `@batch_feature_view` / `@stream_feature_view`
   decorators; Hopsworks uses `fs.create_feature_group()`).
3. Translate the Kafka sync from `write_to_online_store()` to the
   new tool's online API.
4. Re-run the Module 03 integration suite to confirm online lookups.

Estimated effort: **2-3 engineer-weeks** for a tested swap. Reversible.
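
For orientation, the three-method `FeatureRegistry` surface named above could look roughly like the in-memory sketch below. Only the method names come from this ADR. The signatures, the `FeatureRecord` fields, the in-memory storage, and the team and source names are assumptions made for illustration, and the real `feature_store/registry.py` may differ.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureRecord:
    name: str
    owner: str
    dtype: str
    upstream: list = field(default_factory=list)  # lineage: source identifiers


class FeatureRegistry:
    """In-memory stand-in; a real backend would persist to the registry store."""

    def __init__(self):
        self._features = {}

    def register_feature(self, name, owner, dtype):
        if name in self._features:
            raise ValueError(f"feature already registered: {name}")
        record = FeatureRecord(name=name, owner=owner, dtype=dtype)
        self._features[name] = record
        return record

    def get_features(self, owner=None):
        # With no owner given, return every registered feature.
        return [
            record
            for record in self._features.values()
            if owner is None or record.owner == owner
        ]

    def record_lineage(self, feature_name, source):
        if feature_name not in self._features:
            raise KeyError(feature_name)
        self._features[feature_name].upstream.append(source)


registry = FeatureRegistry()
registry.register_feature("tenure_months", owner="churn-team", dtype="float32")
registry.register_feature("monthly_charges", owner="billing-team", dtype="float32")
registry.record_lineage("tenure_months", "parquet:customer_features")
churn_owned = [r.name for r in registry.get_features(owner="churn-team")]
```

Because callers depend only on these three methods, a swap can replace the storage behind them without touching call sites, which is what keeps the reversal bounded.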

## References

- `feature_store.yaml`
- `features/{customer_features,behavioral_features}.py`
- `feature_store/{registry,sync,backfill}.py`
- `migrations/feature_registry.sql`
- ADR-002 (Redis online store choice — independent of Feast)
- ADR-003 (MLflow + DVC — tracking + data versioning, complements Feast)
