Design Uber's event platform: from 10K to 1B events/day
A four-part guided redesign: stakeholder requirements → six-layer architecture → tiered SLAs + cost model → observability + DR + mock staff review. You ship a 69-artifact design portfolio you can carry into staff-level system-design rounds.
This is the system-design question asked at Uber, DoorDash, Lyft, Stripe and any company routing 100M+ events/day.
- 13 requirements (FR-001..005, NFR-001..008) decomposed from 4 stakeholder interviews
- Capacity model from 10K → 1B events/day with a $100/mo → $34K/mo per-component breakdown
- Six-layer architecture blueprint — Kafka topics, Iceberg tiers, Flink, Redis, Trino, observability
- Schema Registry contracts (Avro + Protobuf) and a hot/warm/cold storage tiering YAML
- Five-pillar observability with SLO + burn-rate alert rules and a DR plan (RTO 5 min, RPO 0)
- An INDEX of ~30 staff-interview questions mapped to specific portfolio artifacts
Senior+ system-design rounds got harder.
Staff-track loops at the streaming-heavy companies now expect capacity math, SLA tiering, cost reasoning, and erasure design — not just a whiteboard sketch. This project ships the vocabulary you’re missing.
System design ≠ box drawing
Loops at Uber, DoorDash, Stripe, and Confluent now ask for partition math, error-budget tradeoffs, and unit economics. Anyone can draw boxes; they pay for the reasoning underneath.
Streaming-first orgs run on event platforms
Uber processes 1T+ events/day; LinkedIn, Stripe, DoorDash all run on Kafka-shaped platforms. The shape of the question is the same — only the constants change.
Cost is the new tradeoff currency
‘Scalable’ is table stakes. The real conversation is unit economics — $/event, $/query, $/SLA-tier — and how you defend the next $20K/mo line item.
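To make the $/event framing concrete, here is a minimal sketch that turns the two scale points quoted on this page ($100/mo at 10K events/day, $34K/mo at 1B events/day) into a cost per million events. The 30-day month and the function name are assumptions made for illustration; they are not taken from the project's cost model.

```python
# Unit-economics sketch: cost per million events at the two scale points
# quoted on this page. The 30-day billing month is an assumption.

DAYS_PER_MONTH = 30  # assumption: month approximated as 30 days


def cost_per_million_events(monthly_cost_usd: float, events_per_day: float) -> float:
    """Return USD spent per one million events processed."""
    events_per_month = events_per_day * DAYS_PER_MONTH
    return monthly_cost_usd / events_per_month * 1_000_000


small = cost_per_million_events(100, 10_000)            # 10K events/day, $100/mo
large = cost_per_million_events(34_000, 1_000_000_000)  # 1B events/day, $34K/mo

print(f"10K/day: ${small:,.2f} per million events")
print(f"1B/day:  ${large:,.2f} per million events")
print(f"unit cost falls by a factor of about {small / large:,.0f}")
```

Under these assumptions the unit cost falls from roughly $333 to roughly $1.13 per million events, which is the kind of number a cost-defense conversation usually starts from.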
GDPR + erasure changed the storage stack
Right-to-be-forgotten at 1B events/day forces row-level deletes into the architecture from day one — Iceberg over Hive, BACKWARD-compatible schemas, lineage on by default.
Module 01 is free. The rest unlocks with PRO.
Try the first 2-3 hours — interview the 4 stakeholders, decompose 13 requirements, run the capacity math, score Lambda vs Kappa vs Lakehouse. If the rigor lands, upgrade for the ingestion, serving, and operations modules.
System Design for Data Engineers
This curriculum is the design vocabulary for the project — not a sales add-on. PRO subscribers get full access to every module.
Three sprints. Three checkpoints. One defended platform design.
Each phase ends with a tagged set of artifacts you can hand to a reviewer. No ambiguity about where you are in the redesign.
Stakeholder interviews complete, 13 requirements decomposed, capacity math done from 10K to 1B/day, pattern decision matrix scored, six-layer blueprint drawn.
- ✓Requirements doc (FR-001..005, NFR-001..008)
- ✓Capacity model with $/event at every phase
- ✓Architecture decision matrix (Lambda vs Kappa vs Lakehouse)
- ✓Six-layer blueprint with technology selections
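The capacity math in this phase can be sketched in a few lines. Every constant below (average event size, peak multiplier, per-partition throughput, target utilisation) is an illustrative assumption, not a figure from the project; swap in your own measurements before defending the result.

```python
import math

SECONDS_PER_DAY = 86_400

# All four constants are illustrative assumptions, not project figures.
AVG_EVENT_BYTES = 1_000       # assumed average serialized event size
PEAK_TO_AVERAGE = 3.0         # assumed peak-to-average traffic multiplier
PARTITION_MB_PER_SEC = 10.0   # assumed sustainable write rate per partition
TARGET_UTILISATION = 0.5      # assumed: plan to run partitions at 50% of that rate


def partitions_needed(events_per_day: int) -> tuple[float, float, int]:
    """Return (average events/sec, peak MB/sec, partition count) for a daily volume."""
    avg_eps = events_per_day / SECONDS_PER_DAY
    peak_mb_s = avg_eps * PEAK_TO_AVERAGE * AVG_EVENT_BYTES / 1_000_000
    usable_mb_s = PARTITION_MB_PER_SEC * TARGET_UTILISATION
    partitions = max(1, math.ceil(peak_mb_s / usable_mb_s))
    return avg_eps, peak_mb_s, partitions


for volume in (10_000, 1_000_000_000):
    avg_eps, peak_mb_s, partitions = partitions_needed(volume)
    print(f"{volume:>13,} events/day: {avg_eps:,.1f} events/s avg, "
          f"{peak_mb_s:.3f} MB/s peak, {partitions} partition(s) by throughput")
```

This only bounds the partition count from the write-throughput side. In practice, consumer parallelism and per-key ordering requirements often push the number higher than throughput alone suggests.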
Kafka topology designed, schema contracts written, storage tiered, three-pattern serving layer architected, cost model finalized at every scale.
- ✓10 named Kafka topics + partition strategy + CDC plan
- ✓Avro / Protobuf schema contracts with BACKWARD compatibility
- ✓Hot/warm/cold storage tiering YAML + GDPR erasure design
- ✓Three-pattern serving design (Redis · Trino · Spark+dbt)
- ✓Feature store entity-key schema + freshness config
- ✓$34K/mo cost model at 1B/day with per-component breakdown
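As a rough illustration of what BACKWARD compatibility means for the schema contracts above, the sketch below checks whether a new Avro-style record schema can still read data written with the previous one. It is deliberately simplified: it looks only at top-level fields, treats any type change as a violation (real Avro allows some promotions such as int to long), and ignores unions, aliases, and nested records. The `RideRequested` schema and its fields are hypothetical.

```python
def backward_violations(old_schema: dict, new_schema: dict) -> list[str]:
    """Return reasons the new schema cannot read old data; empty list means compatible.

    Simplified rule set: a field added in the new schema needs a default,
    and a field present in both must keep the same type. Removing a field
    is allowed, because the reader simply ignores it.
    """
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        name = field["name"]
        if name not in old_fields:
            if "default" not in field:
                problems.append(f"added field '{name}' has no default")
        elif field["type"] != old_fields[name]["type"]:
            problems.append(f"field '{name}' changed type")
    return problems


# Hypothetical schemas, for illustration only.
v1 = {"type": "record", "name": "RideRequested", "fields": [
    {"name": "ride_id", "type": "string"},
    {"name": "fare_cents", "type": "long"},
]}
v2_ok = {"type": "record", "name": "RideRequested", "fields": [
    {"name": "ride_id", "type": "string"},
    {"name": "fare_cents", "type": "long"},
    {"name": "surge_multiplier", "type": "double", "default": 1.0},
]}
v2_bad = {"type": "record", "name": "RideRequested", "fields": [
    {"name": "ride_id", "type": "string"},
    {"name": "fare_cents", "type": "long"},
    {"name": "city_id", "type": "string"},
]}

print(backward_violations(v1, v2_ok))
print(backward_violations(v1, v2_bad))
```

A production setup would delegate this check to the schema registry's own compatibility endpoint rather than reimplementing it; the sketch is only meant to show the rule being enforced.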
Observability designed, SLOs and burn-rate alerts defined, DR plan written, mock staff review presented and defended. INDEX of interview questions complete.
- ✓Five-pillar observability design
- ✓SLO + error-budget framework + alert rules YAML
- ✓DR plan (RTO 5 min, RPO 0) + incident runbook
- ✓Mock staff-review presentation with anticipated Q&A
- ✓INDEX of ~30 interview questions → 69 artifacts
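The burn-rate alerting mentioned above can be sketched as follows. The 99.9% SLO target and the sample error ratios are assumptions for illustration; the 14.4 threshold is the commonly used fast-burn value, equal to spending 2% of a 30-day error budget in one hour (0.02 × 720 h). Function names are hypothetical.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than 'exactly on budget' the error budget is being spent."""
    error_budget = 1.0 - slo_target
    return error_ratio / error_budget


def should_page(long_window_errors: float, short_window_errors: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Multi-window rule: page only when both windows burn faster than the threshold.

    The long window (for example 1 hour) shows the burn is sustained; the
    short window (for example 5 minutes) shows it is still happening now.
    """
    return (burn_rate(long_window_errors, slo_target) >= threshold
            and burn_rate(short_window_errors, slo_target) >= threshold)


SLO = 0.999  # assumed target: 99.9% of events delivered within the latency SLA

print(burn_rate(0.02, SLO))            # about 20x budget
print(should_page(0.02, 0.03, SLO))    # sustained and ongoing: page
print(should_page(0.005, 0.03, SLO))   # short spike only: do not page
```

Requiring both windows keeps a brief spike from paging anyone while still catching an outage that would exhaust the monthly budget within days.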
Download your 69-artifact design portfolio.
There’s nothing to install — this is whiteboard-only. Grab the case-study bundle and open INDEX.md to start the 6-step interview rehearsal.
What lives in the bundle
Every artifact you’ll ship across the four parts, organized by part folder, with an INDEX mapping ~30 staff-level interview questions to the specific files that answer them.
- part-1/ — 13 artifacts: requirements, capacity model, decision matrix
- part-2/ — 18 artifacts: topic strategy, schema contracts, storage tiering
- part-3/ — 20 artifacts: serving design, feature store, cost model
- part-4/ — 18 artifacts: SLO framework, alert rules, DR plan
- INDEX.md — ~30 interview questions → artifact map
- interview-walkthrough.md — 6-step, 60-min staff-level rehearsal
Uber Event Platform: Design Portfolio
69 artifacts, 11 YAML configs, INDEX of ~30 interview questions, 6-step interview walkthrough. Reference material for staff-level system-design rounds.
The same blueprint — built for the real failure modes.
Most Kafka system-design write-ups stop at the happy path. The entries below are what a real implementation would actually need in place of each simplification we made on the whiteboard. These are the answers a staff principal will press you on.
- FULL on Tier-1, per-subject compatibility-mode enforced in CI
- MirrorMaker2 to passive region + DR runbook with quarterly failover drill
- rewrite_data_files
- OpenTelemetry distributed tracing for end-to-end event lineage
Real review from staff principals who run event platforms.
Submit your portfolio bundle, get the kind of pushback you’d hear in an actual staff loop — partition math, error-budget tradeoffs, vendor decisions, cost defense.
4 design reviews / month
Submit your portfolio bundle, a single artifact, or a redesign proposal. Reviewer is matched to your domain — Kafka / Iceberg / observability for this project. Async, comments inline, average turnaround 31 hours.
2 mock staff interviews / month
Live 30-min sessions with a staff-level engineer. Defend your design against the questions you’ll actually hear: partition math, EOS guarantees, cost-vs-latency tradeoffs. Group sessions also available.
One subscription. 15+ projects, all curriculum, design review.
PRO is built for senior+ engineers who want production-grade builds and feedback loops — not more tutorials.
Pick this if you’re defending designs, not learning them.
Staff-track senior engineers
You’re prepping for the Uber / DoorDash / Stripe staff loop. You can ship a feature; what you need is the design vocabulary the system-design panel expects.
Tech leads driving streaming migration
You need to defend a Kafka platform redesign in front of leadership. Capacity math, SLA tiering, cost model — the parts you can’t afford to fudge.
Platform architects
You run streaming for 10+ teams. You want a reusable framework for event-platform decisions: topic taxonomy, partition counts, schema policies.
Senior engineers crossing batch → streaming
You know the warehouse cold; the streaming side feels like a different planet. This gives you the architecture grammar for routing 1B events/day.
Going deeper? Four tracks back this project.
System-design vocabulary is the spine. These four curriculums let you go deeper on the layers the project designs but doesn’t implement.
Ready to architect a real event platform?
Start with module 01 — free, no card. Decompose the requirements, run the capacity math, score Lambda vs Kappa vs Lakehouse on a weighted matrix. About 2-3 hours.