ai-de.net/Projects/P02 · Uber Event Platform: Staff Design Portfolio

Last updated 2026-05-22 · By AI-DE Engineering Team

PRO · module 01 free preview · Streaming track · P02

Design Uber's event platform: from 10K to 1B events/day

A four-part guided redesign: stakeholder requirements → six-layer architecture → tiered SLAs + cost model → observability + DR + mock staff review. You ship a 69-artifact design portfolio you can carry into staff-level system-design rounds.

Timeline: ~12h reading · 8-12h self-design
Difficulty: Staff-level
Format: Whiteboard · 69 artifacts

This is the system-design question asked at Uber, DoorDash, Lyft, Stripe and any company routing 100M+ events/day.

By the end you will have

13 requirements (FR-001..005, NFR-001..008) decomposed from 4 stakeholder interviews
Capacity model from 10K → 1B events/day with a $100/mo → $34K/mo per-component breakdown
Six-layer architecture blueprint — Kafka topics, Iceberg tiers, Flink, Redis, Trino, observability
Schema Registry contracts (Avro + Protobuf) and a hot/warm/cold storage tiering YAML
Five-pillar observability with SLO + burn-rate alert rules and a DR plan (RTO 5 min, RPO 0)
An INDEX of ~30 staff-interview questions mapped to specific portfolio artifacts
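
The scale path quoted on this page (10K → 1M → 100M → 1B events/day at $100 → $1.5K → $8K → $34K per month) can be turned into average throughput and cost per million events. The following is a minimal sketch, assuming a flat 30-day month and uniform traffic. Both are simplifications, and the function names are illustrative rather than taken from the bundle.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # assumption: flat 30-day billing month

# (events per day, monthly cost in USD) for each phase, as quoted on this page
PHASES = [
    (10_000, 100),
    (1_000_000, 1_500),
    (100_000_000, 8_000),
    (1_000_000_000, 34_000),
]


def avg_events_per_second(events_per_day: int) -> float:
    """Average throughput, ignoring peaks (real sizing would add a peak factor)."""
    return events_per_day / SECONDS_PER_DAY


def cost_per_million_events(events_per_day: int, monthly_cost_usd: float) -> float:
    """Unit cost in USD per one million events over an assumed 30-day month."""
    monthly_events = events_per_day * DAYS_PER_MONTH
    return monthly_cost_usd / (monthly_events / 1_000_000)


for events_per_day, monthly_cost in PHASES:
    eps = avg_events_per_second(events_per_day)
    unit = cost_per_million_events(events_per_day, monthly_cost)
    print(f"{events_per_day:>13,} events/day  {eps:>10,.1f} events/s  ${unit:,.2f} per 1M events")
```

Under these assumptions the unit cost falls from roughly $333 per million events at 10K/day to about $1.13 per million at 1B/day, which is the kind of per-phase figure the capacity model is meant to defend.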

PREREQ · Built for senior+ engineers prepping for the staff loop. You should be comfortable with event-driven architecture, Kafka basics, and one warehouse or lakehouse engine. This is not a tutorial: it assumes you’ve shipped data platforms before.

uber.event.platform · 6 layers · whiteboard design · 1B/day target

Sources (4 domains · 1B/d): rider.app · driver.app · payments · maps.api

Ingestion (10 topics · 100 partitions): rider.events.raw · driver.events.raw · maps.location.raw · payments.tx

Processing (EOS · windowing):
flink-stream: EOS · windowing
flink-enrich: joins · broadcast
spark-batch: medallion fill
dbt-transform: gold metrics

Storage · serve (5 PB · 3 serving patterns):
gold.metrics: Iceberg · GDPR · Glacier
silver.events: dedup · S3 IA
redis · ML: p99 < 100ms
trino · OLAP: p95 < 5s
dbt · metrics: < 30 min

# Scale path · 100,000×

10K events/d → 1M → 100M → 1B/d

$100/mo → $1.5K → $8K → $34K/mo

27 brokers · 600 partitions · 5 PB Iceberg

→ unit economics defended at every phase

● Tiered SLA · 3 levels

T1: 99.99% · T2: 99.95% · T3: 99.9%

burn-rate alerts: 14.4× · 6× · 1×

RTO 5 min · RPO 0 · GDPR < 72h

→ ~$200K/yr saved by not over-provisioning

1B/d peak event target · 6 architecture layers · 99.99% Tier-1 SLA

Why this matters in 2026

Senior+ system-design rounds got harder.

Staff-track loops at the streaming-heavy companies now expect capacity math, SLA tiering, cost reasoning, and erasure design — not just a whiteboard sketch. This project ships the vocabulary you’re missing.

System design ≠ box drawing

Uber, DoorDash, Stripe, Confluent loops now ask for partition math, error-budget tradeoffs, and unit economics. Anyone can draw boxes; they pay for the reasoning underneath.

Streaming-first orgs run on event platforms

Uber processes 1T+ events/day; LinkedIn, Stripe, DoorDash all run on Kafka-shaped platforms. The shape of the question is the same — only the constants change.

Cost is the new tradeoff currency

‘Scalable’ is table stakes. The real conversation is unit economics — $/event, $/query, $/SLA-tier — and how you defend the next $20K/mo line item.

GDPR + erasure changed the storage stack

Right-to-be-forgotten at 1B events/day forces row-level deletes into the architecture from day one — Iceberg over Hive, BACKWARD-compatible schemas, lineage on by default.

Curriculum · 4 modules · ~12 hours reading

Module 01 is free. The rest unlocks with PRO.

Try the first 2-3 hours — interview the 4 stakeholders, decompose 13 requirements, run the capacity math, score Lambda vs Kappa vs Lakehouse. If the rigor lands, upgrade for the ingestion, serving, and operations modules.

P02 · ~12 hours · 4 modules

Free preview · PRO required

Module 01 is free — no card required. Get a feel for the rigor before paying.

M01 · ✓ Requirements & architecture blueprint

Interview 4 stakeholders (VP Data, Head of ML, Head of Analytics, Compliance). Decompose 13 requirements (FR-001..005, NFR-001..008). Run capacity estimation from 10K to 1B events/day. Score Lambda vs Kappa vs Lakehouse on a weighted matrix. Ship a six-layer blueprint with technology selections.

~3h · 13 lessons · FREE PREVIEW
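
Scoring Lambda vs Kappa vs Lakehouse on a weighted matrix comes down to a weighted sum per pattern. Here is a minimal sketch. The criteria, weights, and 1-5 scores are hypothetical placeholders chosen for illustration. They are not the values used in the module and not a recommendation.

```python
# Hypothetical criteria weights (sum to 1.0); replace with your stakeholders' priorities.
WEIGHTS = {"latency": 0.30, "cost": 0.25, "operability": 0.25, "erasure_support": 0.20}

# Hypothetical 1-5 scores per pattern; illustrative placeholders only.
SCORES = {
    "Lambda":    {"latency": 4, "cost": 2, "operability": 2, "erasure_support": 3},
    "Kappa":     {"latency": 5, "cost": 3, "operability": 3, "erasure_support": 2},
    "Lakehouse": {"latency": 3, "cost": 4, "operability": 4, "erasure_support": 5},
}


def weighted_score(scores: dict, weights: dict) -> float:
    """Weighted sum of criterion scores; fails loudly if a criterion is unscored."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(all_scores: dict, weights: dict) -> list:
    """Return (pattern, total) pairs sorted from highest to lowest total."""
    totals = [(name, weighted_score(s, weights)) for name, s in all_scores.items()]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)


for pattern, total in rank(SCORES, WEIGHTS):
    print(f"{pattern:<10} {total:.2f}")
```

The value of the matrix in a review is less the final ranking than the explicit weights: changing one weight and re-running shows how sensitive the decision is to a single stakeholder priority.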

M02 · ⊘ Ingestion & storage architecture

Design 10 Kafka topics with naming convention + partition strategy. Pick CDC patterns for MySQL/Postgres/DynamoDB/HTTP sources. Write Schema Registry contracts with BACKWARD compatibility. Tier storage hot/warm/cold with lifecycle policies. Bake GDPR erasure into the design via Iceberg row-level deletes.

~3h · 18 lessons · PRO TIER
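
A partition strategy usually starts from a throughput floor: enough partitions that neither producers nor consumers become the bottleneck at peak. Below is a minimal sketch of that rule of thumb. The event size, peak factor, per-partition throughput figures, and headroom are all hypothetical inputs you would measure on your own cluster. It yields a floor for one topic and does not attempt to reproduce the partition counts quoted on this page.

```python
import math

SECONDS_PER_DAY = 86_400


def required_partitions(
    events_per_day: int,
    avg_event_bytes: int,
    peak_factor: float,
    producer_mb_s_per_partition: float,
    consumer_mb_s_per_partition: float,
    headroom: float = 1.0,
) -> int:
    """Throughput-based partition floor: max(T/p, T/c), scaled by headroom.

    T is peak throughput in MB/s; p and c are the measured per-partition
    producer and consumer throughputs. All inputs are assumptions to be
    replaced with benchmark results.
    """
    avg_events_per_s = events_per_day / SECONDS_PER_DAY
    peak_mb_s = avg_events_per_s * peak_factor * avg_event_bytes / 1_000_000
    by_producer = peak_mb_s / producer_mb_s_per_partition
    by_consumer = peak_mb_s / consumer_mb_s_per_partition
    return math.ceil(max(by_producer, by_consumer) * headroom)


# Hypothetical inputs: 1 KB events, 3x peak, 10 MB/s produce and 5 MB/s consume
# per partition, 50% headroom for growth and rebalancing.
floor = required_partitions(
    events_per_day=1_000_000_000,
    avg_event_bytes=1_000,
    peak_factor=3.0,
    producer_mb_s_per_partition=10.0,
    consumer_mb_s_per_partition=5.0,
    headroom=1.5,
)
print(f"throughput floor: {floor} partitions")
```

In practice the final count is often driven above this floor by consumer-group parallelism, key distribution, and per-topic isolation, which is why the floor is only the first line of the partition argument.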

M03 · ⊘ Serving layer & scale engineering

Architect three serving patterns: real-time (<100ms via Redis feature store), interactive (<5s via Trino), batch (<30min via Spark + dbt). Design the entity-key schema for ML features. Model the cost from 10K to 1B events/day with per-component breakdown ($34K/mo at 1B/day).

~3h · 20 lessons · PRO TIER

M04 · ⊘ Production operations & defense

Design the five-pillar observability stack (freshness, volume, schema, distribution, lineage). Define 3 SLOs with burn-rate alerts (14.4x / 6x / 1x). Write the DR plan, incident runbook, and tiered alert matrix. Defend the whole design in a mock staff-level review with anticipated Q&A.

~3h · 18 lessons · PRO TIER
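
The SLA tiers (99.99% / 99.95% / 99.9%) and burn-rate multipliers (14.4x / 6x / 1x) quoted on this page translate into concrete error budgets and alert thresholds. The sketch below assumes a 30-day SLO period and pairs the multipliers with 1-hour, 6-hour, and 3-day windows, following the common multi-window convention. The page does not state the windows, so treat them as an assumption.

```python
SLO_PERIOD_HOURS = 30 * 24  # assumption: 30-day SLO period

# Availability targets per tier, as quoted on this page.
TIERS = {"T1": 0.9999, "T2": 0.9995, "T3": 0.999}

# (burn-rate multiplier from this page, assumed alert window in hours)
BURN_RATE_ALERTS = [(14.4, 1), (6.0, 6), (1.0, 72)]


def error_budget_minutes(slo: float) -> float:
    """Allowed minutes of unavailability over the SLO period."""
    return (1 - slo) * SLO_PERIOD_HOURS * 60


def budget_consumed(burn_rate: float, window_hours: float) -> float:
    """Fraction of the period's error budget spent if burn_rate holds for the window."""
    return burn_rate * window_hours / SLO_PERIOD_HOURS


def error_rate_threshold(slo: float, burn_rate: float) -> float:
    """Error ratio at which an alert with this burn rate should fire."""
    return burn_rate * (1 - slo)


for tier, slo in TIERS.items():
    print(f"{tier}: {error_budget_minutes(slo):.2f} budget minutes per 30 days")

for burn_rate, window_hours in BURN_RATE_ALERTS:
    spent = budget_consumed(burn_rate, window_hours)
    threshold = error_rate_threshold(TIERS["T1"], burn_rate)
    print(f"{burn_rate}x over {window_hours}h spends {spent:.0%} of budget; "
          f"T1 fires above {threshold:.4%} errors")
```

Under these assumptions a Tier-1 target leaves about 4.3 minutes of budget per 30 days, and the three alerts correspond to spending 2%, 5%, and 10% of that budget within their windows.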

3 modules locked · Unlock all PRO content for $29/mo

Backed by curriculum

System Design for Data Engineers

10 modules · 16 hours · capacity estimation · SLA tiering · decision matrices · six-layer architecture · cost modeling

This curriculum is the design vocabulary for the project — not a sales add-on. PRO subscribers get full access to every module.

The design, in 3 checkpoints

Three sprints. Three checkpoints. One defended platform design.

Each phase ends with a tagged set of artifacts you can hand to a reviewer. No ambiguity about where you are in the redesign.

01 · ~3h

Requirements & architecture

Stakeholder interviews complete, 13 requirements decomposed, capacity math done from 10K to 1B/day, pattern decision matrix scored, six-layer blueprint drawn.

✓ Requirements doc (FR-001..005, NFR-001..008)
✓ Capacity model with $/event at every phase
✓ Architecture decision matrix (Lambda vs Kappa vs Lakehouse)
✓ Six-layer blueprint with technology selections

02 · ~6h

Ingestion → storage → serving

Kafka topology designed, schema contracts written, storage tiered, three-pattern serving layer architected, cost model finalized at every scale.

✓ 10 named Kafka topics + partition strategy + CDC plan
✓ Avro / Protobuf schema contracts with BACKWARD compatibility
✓ Hot/warm/cold storage tiering YAML + GDPR erasure design
✓ Three-pattern serving design (Redis · Trino · Spark+dbt)
✓ Feature store entity-key schema + freshness config
✓ $34K/mo cost model at 1B/day with per-component breakdown
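
BACKWARD compatibility means a consumer on the new schema can still read records written with the previous one, so added fields need defaults while removed fields are tolerated. The following is a simplified sketch of that check over Avro-style record definitions held as plain dictionaries. The `RiderEvent` schemas and field names are hypothetical, and a real pipeline would rely on the Schema Registry's own compatibility check rather than this function.

```python
def backward_compat_problems(old_schema: dict, new_schema: dict) -> list:
    """Return reasons the new (reader) schema cannot read data written with the old one."""
    old_fields = {f["name"]: f for f in old_schema["fields"]}
    problems = []
    for field in new_schema["fields"]:
        name = field["name"]
        if name not in old_fields:
            # A reader needs a default to fill a field that old records never wrote.
            if "default" not in field:
                problems.append(f"added field '{name}' has no default")
        elif field["type"] != old_fields[name]["type"]:
            # Conservative: Avro permits some type promotions; this sketch flags every change.
            problems.append(f"field '{name}' changed type")
    # Fields dropped from the new schema are fine: the reader simply ignores them.
    return problems


# Hypothetical schemas for illustration only.
V1 = {"type": "record", "name": "RiderEvent", "fields": [
    {"name": "event_id", "type": "string"},
    {"name": "rider_id", "type": "string"},
]}
V2_OK = {"type": "record", "name": "RiderEvent", "fields": [
    {"name": "event_id", "type": "string"},
    {"name": "rider_id", "type": "string"},
    {"name": "city", "type": ["null", "string"], "default": None},
]}
V2_BAD = {"type": "record", "name": "RiderEvent", "fields": [
    {"name": "event_id", "type": "string"},
    {"name": "rider_id", "type": "string"},
    {"name": "city", "type": "string"},
]}

print(backward_compat_problems(V1, V2_OK))
print(backward_compat_problems(V1, V2_BAD))
```

A check of this shape is what lets consumers upgrade before producers, which is the deployment order BACKWARD mode is designed for.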

03 · ~3h

Operations & defense

Observability designed, SLOs and burn-rate alerts defined, DR plan written, mock staff review presented and defended. INDEX of interview questions complete.

✓ Five-pillar observability design
✓ SLO + error-budget framework + alert rules YAML
✓ DR plan (RTO 5 min, RPO 0) + incident runbook
✓ Mock staff-review presentation with anticipated Q&A
✓ INDEX of ~30 interview questions → 69 artifacts

Project setup · 5 minutes

Download your 69-artifact design portfolio.

There’s nothing to install — this is whiteboard-only. Grab the case-study bundle and open INDEX.md to start the 6-step interview rehearsal.

What lives in the bundle

Every artifact you’ll ship across the four parts, organized by part folder, with an INDEX mapping ~30 staff-level interview questions to the specific files that answer them.

part-1/ — 13 artifacts: requirements, capacity model, decision matrix
part-2/ — 18 artifacts: topic strategy, schema contracts, storage tiering
part-3/ — 20 artifacts: serving design, feature store, cost model
part-4/ — 18 artifacts: SLO framework, alert rules, DR plan
INDEX.md — ~30 interview questions → artifact map
interview-walkthrough.md — 6-step, 60-min staff-level rehearsal

Download · Case-Study Bundle

Uber Event Platform: Design Portfolio

69 artifacts, 11 YAML configs, INDEX of 30 interview questions, 6-step interview walkthrough. Reference material for staff-level system-design rounds.

~80 KB · 69 artifacts · 11 YAML configs · PRO required

~/portfolio/kafka-event-routing — zsh

1. Unzip the bundle

$ unzip kafka-event-routing-case-study.zip

$ cd kafka-event-routing-case-study

2. Open the INDEX

$ open INDEX.md # 30 interview questions → artifacts (macOS; use xdg-open on Linux)

3. Walk the interview rehearsal

$ open docs/interview-walkthrough.md # 6-step, 60-min staff drill

4. Browse artifacts by part

$ ls -R part-*/ # 69 artifacts across 4 parts

Production hardening

The same blueprint — built for the real failure modes.

Most Kafka system-design write-ups stop at the happy path. The table below pairs every simplification we made on the whiteboard with what a real implementation would actually need — the answers a staff principal will press you on.

Tutorial design = what we drew on the whiteboard · Production add-on = what you’d ship next

Schema compatibility
Tutorial design: BACKWARD on every subject
Production add-on: FULL on Tier-1, per-subject compatibility mode enforced in CI

Failover
Tutorial design: Single Kafka cluster, multi-AZ
Production add-on: MirrorMaker2 to passive region + DR runbook with quarterly failover drill

Storage lifecycle
Tutorial design: S3 Standard → IA → Glacier
Production add-on: + cross-region replication + erasure-aware rewrite_data_files

SLO scope
Tutorial design: 3 platform-wide burn-rate alerts
Production add-on: Per-tenant SLOs (one Tier-1 contract per consumer) + budget chargeback

Observability
Tutorial design: Five-pillar stack with Prometheus
Production add-on: + OpenTelemetry distributed tracing for end-to-end event lineage

GDPR erasure
Tutorial design: Iceberg row-level delete
Production add-on: + erasure-aware materialized-view rebuild + downstream cache eviction

PRO benefit · design review

Real review from staff principals who run event platforms.

Submit your portfolio bundle, get the kind of pushback you’d hear in an actual staff loop — partition math, error-budget tradeoffs, vendor decisions, cost defense.

4 design reviews / month

Submit your portfolio bundle, a single artifact, or a redesign proposal. Reviewer is matched to your domain — Kafka / Iceberg / observability for this project. Async, comments inline, average turnaround 31 hours.

31h avg turnaround · 9.2/10 helpfulness · 94% return next month

2 mock staff interviews / month

Live 30-min sessions with a staff-level engineer. Defend your design against the questions you’ll actually hear: partition math, EOS guarantees, cost-vs-latency tradeoffs. Group sessions also available.

30 min per session · 2 / mo included · + group (unlimited)

What PRO unlocks

One subscription. 15+ projects, all curriculum, design review.

PRO is built for senior+ engineers who want production-grade builds and feedback loops — not more tutorials.

What you get (FREE · PRO · EXPERT)

Projects (production-grade builds + design): PRO: 15+
Curriculum modules (all 7 tracks): FREE: Phase 1 only · PRO: All · EXPERT: All + bonus
Review credits (senior+ engineer review): PRO: 4 / month · EXPERT: Unlimited
Career path access (5 paths × full plans): FREE: 1 path · PRO: All 5 · EXPERT: All 5 + 1:1
Certificate (verifiable on LinkedIn): FREE: No · PRO: Yes · EXPERT: Yes + portfolio review
Community (Discord + office hours): FREE: Read-only · PRO: Full + 2/mo · EXPERT: Full + 4/mo

$29/mo billed monthly · cancel anytime, or annual: $249/yr (save 28%)

Who this is for

Pick this if you’re defending designs, not learning them.

Staff-track senior engineers

You’re prepping for the Uber / DoorDash / Stripe staff loop. You can ship a feature; what you need is the design vocabulary the system-design panel expects.

Tech leads driving streaming migration

You need to defend a Kafka platform redesign in front of leadership. Capacity math, SLA tiering, cost model — the parts you can’t afford to fudge.

Platform architects

You run streaming for 10+ teams. You want a reusable framework for event-platform decisions: topic taxonomy, partition counts, schema policies.

Senior engineers crossing batch → streaming

You know the warehouse cold; the streaming side feels like a different planet. This gives you the architecture grammar for routing 1B events/day.

Related curriculum

Going deeper? Four tracks back this project.

System-design vocabulary is the spine. These four curriculums let you go deeper on the layers the project designs but doesn’t implement.

FAQ

Quick answers.

Will I write any code?

No — this is a design exercise. You’ll ship YAML schema contracts, decision matrices, capacity spreadsheets, alert rule files, and runbook drafts. The accompanying skill toolkits (Kafka Streams, Flink, Iceberg) are where you build the things this project designs.

Is module 01 actually free?

Yes. Stakeholder interviews, 13 requirements decomposed, capacity estimation, and the architecture decision matrix. About 2-3 hours. By the end you can run the same exercise on a different domain.

How is this different from /projects/flink-fraud-detection?

P01 is a build project: you write Flink code that runs on Kafka events. This project is the design portfolio — the staff-level reasoning that justifies why a Flink pipeline at all, what topics feed it, and what it costs at 1B/day. They pair: design here, build there.

What about KRaft, MirrorMaker2, or exactly-once details?

Each is name-checked in the architecture but not deep-dived — they’re tools you reach for in implementation, and this project’s lane is the platform-shape decisions above them. The hardening section maps where each one would slot in.

Will this help me pass a staff DE interview?

That’s the explicit target. The case-study bundle includes an INDEX of ~30 questions ('How do you size partitions for 1B events/day?', 'How do you defend $34K/mo platform cost?') each mapped to a specific artifact you produced. Plus a 6-step, 60-minute interview rehearsal.

What does PRO unlock for $29/mo?

All 15+ PRO projects, 4 design-review credits per month, 2 mock-interview sessions, full curriculum across all 7 tracks, all 5 career paths, certificate of completion, and full community access. Cancel anytime.

Related projects

Paired with this project

P18 · PAID · analytics

Staff+ leadership playbook

RFC → ADR → architecture review → blameless postmortem. The four artifacts a promo committee actually reads.

Ready to architect a real event platform?

Start with module 01 — free, no card. Decompose the requirements, run the capacity math, score Lambda vs Kappa vs Lakehouse on a weighted matrix. About 2-3 hours.

Related curriculum tracks:

Event Design & Data Contracts
Kafka Streams Learning Path
Apache Iceberg & Modern Lakehouse Architecture
Cost Optimization for Data Engineers
