Real-world production projects.
Not toy tutorials.
Every project in this catalog is designed as a deployable, interview-ready system. Ship it, write it up, talk through it in the hiring loop.
The 3 projects that get you hired.
Flink fraud detection
Stateful streaming pipeline on Flink + Kafka — 5 keyed-state detectors, exactly-once via 2PC, Flink K8s Operator with ZK HA.
Iceberg Lakehouse Foundations
Local ACID lakehouse on Iceberg + Nessie + MinIO. Bronze → Silver → Gold + maintenance.
Enterprise RAG — retrieval-quality build
EXPERT-tier retrieval-quality RAG: 4-strategy chunking A/B (62/78/85%), hybrid BM25 + dense + RRF, cross-encoder reranker, RAGAS 4-metric canary, LLM gateway with fallback. 5 ADRs + cost-model CSV bundled.
Foundations
Commerce data warehouse
Kimball star schema with 22 dbt models — atomic + event facts, SCD2 via snapshots, incremental processing, and a GitHub Actions Slim CI gate.
Ecommerce analytics modeling layer
Free 17-model dbt analytics modeling layer for ShopCo — star schema with documented grain, incremental + SCD2, RFM-scored LTV, and a dbt Cloud + Slim CI production spine.
Batch pipelines
Iceberg Lakehouse Foundations
Local ACID lakehouse on Iceberg + Nessie + MinIO. Bronze → Silver → Gold + maintenance.
ShopStream Spark Batch Pipeline
Spark + Delta Lake batch ETL with 4 documented optimization patterns (9x progression), ACID lakehouse, and a Kafka/K8s streaming overlay.
Airflow + dbt: production pipeline foundations
3 production DAGs: REST-API ingestion + idempotent UPSERT, multi-source orchestration with TaskGroups + dynamic mapping, dbt medallion with quality-gated tests. Local Docker.
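A minimal sketch of what "idempotent UPSERT" means in practice: replaying the same batch leaves the table unchanged. It assumes SQLite as a stand-in for the warehouse, and the `orders` table, its columns, and `upsert_orders` are hypothetical names chosen for illustration, not taken from the project.

```python
import sqlite3


def upsert_orders(conn, rows):
    """Insert rows, or update them in place when order_id already exists."""
    conn.executemany(
        """
        INSERT INTO orders (order_id, status, amount)
        VALUES (:order_id, :status, :amount)
        ON CONFLICT(order_id) DO UPDATE SET
            status = excluded.status,
            amount = excluded.amount
        """,
        rows,
    )
    conn.commit()


# Hypothetical table; the real project schema is not shown in this catalog.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT, amount REAL)"
)

batch = [
    {"order_id": "o-1", "status": "paid", "amount": 19.99},
    {"order_id": "o-2", "status": "pending", "amount": 5.00},
]

upsert_orders(conn, batch)
upsert_orders(conn, batch)  # replaying the same batch must not duplicate rows
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# A later API page carrying a changed record updates the row in place.
upsert_orders(conn, [{"order_id": "o-2", "status": "paid", "amount": 5.00}])
o2_status = conn.execute(
    "SELECT status FROM orders WHERE order_id = 'o-2'"
).fetchone()[0]
```

Keying the write on a stable business identifier is what makes a retried task safe to rerun.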
IceLake Commerce — end-to-end Iceberg tour
Breadth tour in 13h: foundations, multi-engine, Debezium + Kafka + Flink CDC, and Feast on Iceberg.
Streaming
Flink fraud detection
Stateful streaming pipeline on Flink + Kafka — 5 keyed-state detectors, exactly-once via 2PC, Flink K8s Operator with ZK HA.
Real-time fraud detection on Kafka Streams
Stateful Kafka Streams topology with KTable enrichment, EOS v2, and Strimzi K8s manifests.
Schema evolution & data contracts
FastAPI schema registry, dbt + Great Expectations enforcement, GitHub Actions PR gate, and column-level lineage with NetworkX + OpenLineage.
Real-time fraud feature store
Feast + Kafka + Spark Streaming spine for a fraud model: 22 features, p99 < 10ms, Schema Registry + Avro, Helm/K8s.
Data quality
Data observability stack
Detect, trace, prevent: dbt + OpenLineage + Grafana on a pre-broken warehouse.
Data governance & contracts
ODCS contracts, GE + Soda validation, Avro + Schema Registry PR gate, 4-tier PII + RBAC + hashed audit, SOC2 + GDPR engines.
AI & vectors
Enterprise RAG — retrieval-quality build
EXPERT-tier retrieval-quality RAG: 4-strategy chunking A/B (62/78/85%), hybrid BM25 + dense + RRF, cross-encoder reranker, RAGAS 4-metric canary, LLM gateway with fallback. 5 ADRs + cost-model CSV bundled.
PredictFlow — production MLOps platform with Feast + BentoML
EXPERT-tier MLOps build: MLflow + DVC + Feast (offline/online), BentoML on K8s with HPA + canary, Evidently drift + cron-gated retrain, Prometheus + Grafana. Modules 01-02 with PRO; full platform with EXPERT.
LLM evaluation framework — multi-judge cascade + recall@k gate
EXPERT-tier eval build: 3-judge cascade (Haiku → Sonnet → GPT-4o), variance-based agreement, recall@k regression gate in GitHub Actions, RAGAS scaffolding, online drift detection, 5 committed ADRs (one Deprecated), runnable cost-model CSV. 7 modules · 17-19h. Module 01 with PRO.
AI cost optimization (CostGuard)
Cost-aware LLM platform: token tracking, dual-tier cache, 4-strategy router, three-tier budget governance. 5 ADRs + cost-model CSV bundled.
Agentic data pipeline — LangGraph supervisor + HITL + ADRs
EXPERT-tier agent platform: LangGraph supervisor + 4 worker agents, RBAC tool registry, Redis checkpointing + 24h time-travel, HITL via interrupt_before + Slack actionable buttons, FailureDetector + ToolCallGuard, multi-tenant platform-design capstone, 5 committed ADRs (one Deprecated), runnable cost-model CSV. 6 modules · 17-18h. Modules 01-03 with PRO.
AI retrieval platform — pgvector + hybrid + RRF + cross-encoder
EXPERT-tier retrieval build: pgvector + HNSW, BM25 + RRF, cross-encoder reranker, OpenAI function-calling agent, semantic cache, drift detection, multi-region replication code. Modules 01-02 with PRO; full platform with EXPERT.
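For readers new to RRF (reciprocal rank fusion), here is a minimal sketch of how a BM25 ranking and a dense-vector ranking can be merged into one list. The function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration, not the project's actual code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    the scores are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only ranks, it needs no score normalization between BM25 and cosine similarity. A cross-encoder reranker would then typically rescore the top of the fused list.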
AI serving platform — vLLM + Ray Serve under SLA
EXPERT-tier inference build: vLLM continuous batching + PagedAttention, Ray Serve autoscale (market-hours min=2), Redis semantic cache (35% hit), ServingCircuitBreaker, 5 chaos scenarios + runbook, runnable cost-model CSV with break-even-vs-OpenAI math. Module 01 with PRO.
LLM training-data pipeline — crawl + dedup + RAG + LLMOps
EXPERT-tier dataset-engineering build: aiohttp crawler + MinHash/LSH dedup + quality scoring + tokenizer + pgvector/Pinecone RAG + vLLM serving + Airflow DAGs + Locust load tests + CI eval gate. Modules 01-02 with PRO; full platform with EXPERT.
Enterprise AI platform — multi-tenant governed RAG
EXPERT-tier governance build: pgvector + Postgres RLS, Presidio + jailbreak guardrails, lineage + policy in Redis, per-tenant cost tracker, OTel + Prometheus. 4 modules · 20-26h. Module 01 with PRO.
Platform
CI/CD data platform
Terraform + GitHub Actions + dbt CI — the platform under the platform.
DataGuard reliability
SRE for the data platform: dependency graphs, dep-aware SLOs, chaos engineering, multi-team incident management.
Cloud cost optimization
Cut a $300K Snowflake bill by 60%: forensics, right-sizing, compaction, governance.
Multi-source ingestion service
REST + webhook + S3 + SaaS through one Airflow DAG, with backoff, dedup, and schema gates.
Staff Engineering
Uber Event Platform: Staff Design Portfolio
Staff-level system-design portfolio: redesign Uber's event platform, 10K → 1B events/day. 69 artifacts, no code.
Full-stack AI platform — full RAG system + production hardening
EXPERT-tier full-stack RAG: pgvector + HNSW + hybrid retrieval + RRF + cross-encoder rerank, 4-class query router with confidence threshold, 3-level failure cascade (RAG → LLM-only → cached), per-tenant index isolation, eval gates, cost guardrails, 6-mode incident simulator, 5 committed ADRs (one Deprecated), runnable cost-model CSV. 6 modules · 20-22h. Modules 01-03 with PRO.
Multi-cloud platform foundation
Terraform IaC for AWS + GCP: dev/staging/prod, 5-role RBAC, KMS secrets, AWS Budgets + Grafana FinOps.
Experimentation platform on dbt + scipy
Welch's t-test, minimum detectable effect (MDE), and sample ratio mismatch (SRM) checks feeding a scorecard. 26+ dbt models, scipy stats, FastAPI + Redis flag service, lifecycle state machine.
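A rough sketch of two of the checks named above: an SRM test on assignment counts and Welch's t statistic on a metric. The project itself lists scipy; this version uses only the standard library so it runs anywhere. The function names, the sample data, and the 0.001 SRM alert threshold are assumptions for illustration.

```python
import math
from statistics import mean, variance


def srm_check(control_n, treatment_n, expected_control_share=0.5):
    """Chi-square goodness-of-fit test for sample ratio mismatch (1 d.o.f.)."""
    total = control_n + treatment_n
    expected = [total * expected_control_share,
                total * (1 - expected_control_share)]
    observed = [control_n, treatment_n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of each sample mean
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical assignment counts for a 50/50 experiment.
chi2, srm_p = srm_check(5000, 5200)
srm_flagged = srm_p < 0.001  # assumed alert threshold

# Hypothetical per-user metric values.
t_stat, df = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

Running the SRM check before reading any metric is the usual order: a skewed split suggests an assignment or logging bug, which makes the t statistic untrustworthy.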
Staff+ leadership playbook
RFC → ADR → architecture review → blameless postmortem. The four artifacts a promo committee actually reads.
Can’t decide which one to start with?
Take the 2-minute skill assessment. We’ll match you to the project that fits your level and the career you’re aiming for.