Hands-On Project
~14 hours · 4 parts

Staff Data Engineer Playbook: System Design & Leadership

Master the soft skills and system design frameworks required for senior/staff roles. Write technical RFCs, defend architecture tradeoffs, handle stakeholder pushback, and lead incident postmortems.

Staff Engineer · Tech Lead · Senior DE · Engineering Manager

Review Process Flow

DESIGN: RFC · Mermaid · Scope
DEFEND: ADR · TCO · Matrices
CONSENSUS: RACI · Facilitation · Dissent
POSTMORTEM: 5 Whys · RCA · SLOs

What You'll Build

1. Foundation: Writing the Design Document

3–4 hours

Author a production-quality technical design document from scratch — problem statement, scope boundaries, architecture options with tradeoff matrices, data flow diagrams, risk registers, and implementation timelines. Follow RFC formats used at Netflix, Uber, and Google.

Complete RFC/design document following industry-standard formats
Problem statement with scope boundaries and non-goals
Architecture options matrix (3+ alternatives with scored tradeoffs)
Data flow and sequence diagrams (Mermaid + whiteboard)
Risk register with mitigations and probability scoring
Implementation timeline with milestones and dependencies
6/6 items complete — Design document ready for review
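
The scored options matrix listed above can be sketched as a weighted sum. This is a minimal illustration, not the project's own scoring method: the criteria weights, the three option names, and the 1-5 scores are all hypothetical placeholders to replace with your own.

```python
from typing import Dict, List, Tuple

# Hypothetical criteria weights (assumed to sum to 1.0); tune to your priorities.
WEIGHTS: Dict[str, float] = {
    "latency": 0.3,
    "cost": 0.3,
    "complexity": 0.2,
    "reliability": 0.2,
}

# Hypothetical 1-5 scores per option (higher is better); not real data.
OPTIONS: Dict[str, Dict[str, int]] = {
    "batch_warehouse": {"latency": 2, "cost": 4, "complexity": 4, "reliability": 4},
    "streaming_lakehouse": {"latency": 5, "cost": 2, "complexity": 2, "reliability": 3},
    "hybrid_lambda": {"latency": 4, "cost": 3, "complexity": 1, "reliability": 3},
}


def weighted_score(scores: Dict[str, int], weights: Dict[str, float]) -> float:
    """Return the weighted sum of one option's scores."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criteria weights must sum to 1.0")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_options(
    options: Dict[str, Dict[str, int]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (option, score) pairs sorted from best to worst."""
    scored = [(name, round(weighted_score(s, weights), 2)) for name, s in options.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_options(OPTIONS, WEIGHTS):
        print(f"{name}: {score}")
```

Publishing the weights alongside the scores lets reviewers challenge the priorities directly instead of arguing about the final ranking.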
2. Analysis: Defending Technical Tradeoffs

3–4 hours

Build rigorous Architecture Decision Records (ADRs), construct cost-benefit analyses with TCO modeling, prepare stakeholder-specific communication strategies, and anticipate the hardest questions your reviewers will ask.

Architecture Decision Record (ADR) catalog with 5+ decisions
Cost-benefit analysis with 3-year TCO modeling
Tradeoff defense matrices (latency, cost, complexity, reliability)
Stakeholder communication plans (engineering, product, leadership)
Objection playbook — top 10 anticipated questions with prepared answers
Decision log with context, rationale, and consequences
12/12 items complete — ADR catalog with defended decisions
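
A 3-year TCO comparison like the one listed above can be sketched in a few lines. The cost categories, the growth-rate assumption, and the two example profiles below are hypothetical; a real model would also account for discounting, migration risk, and decommissioning costs.

```python
from dataclasses import dataclass


@dataclass
class CostProfile:
    """Hypothetical cost inputs for one architecture option (all in one currency)."""
    name: str
    upfront: float            # one-time build or migration cost
    annual_run: float         # year-1 infrastructure and licensing cost
    annual_people: float      # yearly engineering time spent on operations
    growth_rate: float = 0.0  # assumed yearly growth in run cost (0.10 = 10%)


def total_cost_of_ownership(profile: CostProfile, years: int = 3) -> float:
    """Sum upfront cost plus run and people costs over the given number of years."""
    total = profile.upfront
    for year in range(years):
        run_cost = profile.annual_run * (1 + profile.growth_rate) ** year
        total += run_cost + profile.annual_people
    return total


if __name__ == "__main__":
    # Illustrative numbers only, not benchmarks.
    managed = CostProfile("managed_service", 50_000, 120_000, 40_000, 0.10)
    self_hosted = CostProfile("self_hosted", 150_000, 60_000, 120_000, 0.10)
    for option in (managed, self_hosted):
        print(f"{option.name}: {total_cost_of_ownership(option):,.0f}")
```

Keeping people cost as its own field makes the operational burden of each option visible, which is often the line item reviewers question first.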
3. Leadership: Handling Pushback & Building Consensus

3–4 hours

Simulate an architecture review meeting where you facilitate technical disagreements, navigate strong opinions from senior engineers, build consensus across engineering and product teams, and document decisions with dissent.

Architecture review meeting facilitation script
RACI matrix for decision ownership across teams
Disagreement resolution framework (disagree-and-commit model)
Escalation playbook for unresolved technical conflicts
Consensus documentation with recorded dissenting opinions
Follow-up action items with owners, deadlines, and success criteria
18/18 items complete — Architecture review facilitated and documented
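
A RACI matrix is easy to get subtly wrong, so a small check can help before the review meeting. The sketch below assumes the common convention of exactly one Accountable party and at least one Responsible party per decision; the decision and team names are hypothetical.

```python
from typing import Dict, List

VALID_ROLES = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed

# Hypothetical decisions and teams, for illustration only.
RACI: Dict[str, Dict[str, str]] = {
    "choose_storage_format": {
        "data_platform": "A",
        "ingestion": "R",
        "analytics": "C",
        "product": "I",
    },
    "define_freshness_slo": {
        "data_platform": "R",
        "analytics": "A",
        "product": "A",  # deliberate mistake: a second Accountable team
    },
}


def validate_raci(matrix: Dict[str, Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the matrix passes."""
    problems: List[str] = []
    for decision, assignments in matrix.items():
        unknown = sorted(set(assignments.values()) - VALID_ROLES)
        if unknown:
            problems.append(f"{decision}: unknown role codes {unknown}")
        accountable = [team for team, role in assignments.items() if role == "A"]
        if len(accountable) != 1:
            problems.append(
                f"{decision}: expected exactly one Accountable, found {len(accountable)}"
            )
        if not any(role == "R" for role in assignments.values()):
            problems.append(f"{decision}: no Responsible team assigned")
    return problems


if __name__ == "__main__":
    for problem in validate_raci(RACI):
        print(problem)
```

Running the check on the sample matrix flags the decision with two Accountable teams, which is the kind of ambiguity that tends to surface later as an escalation.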
4. Crisis: Incident Postmortem Simulation

3–4 hours

Reconstruct a realistic data pipeline incident from timeline to root cause, write a blameless postmortem using Google/Meta formats, define SLO-driven action items, and present findings to engineering leadership.

Incident timeline reconstruction with 5 Whys analysis
Root cause analysis using Fishbone and Fault Tree methods
Blameless postmortem document (Google SRE format)
Action item tracker with SLO-driven prioritization
Leadership briefing deck (5-slide executive summary)
Recurring incident prevention framework with monitoring triggers
Production postmortem completed and presented
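
SLO-driven prioritization starts from the error budget. The sketch below assumes an availability-style SLO over a rolling window and measures each incident from detection to resolution; the function names and the sample incidents are hypothetical, not taken from the project's scenario file.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Incident = Tuple[datetime, datetime]  # (detected_at, resolved_at)


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an SLO target such as 0.999."""
    return (1 - slo_target) * window_days * 24 * 60


def mean_time_to_resolve(incidents: List[Incident]) -> timedelta:
    """Average detection-to-resolution duration across incidents (MTTR)."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)


def budget_consumed(incidents: List[Incident], slo_target: float, window_days: int = 30) -> float:
    """Fraction of the error budget used; values above 1.0 mean the SLO was missed."""
    downtime_minutes = sum(
        (resolved - detected).total_seconds() / 60 for detected, resolved in incidents
    )
    return downtime_minutes / error_budget_minutes(slo_target, window_days)


if __name__ == "__main__":
    # Illustrative incidents only.
    sample = [
        (datetime(2024, 1, 1, 2, 0), datetime(2024, 1, 1, 2, 30)),
        (datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),
    ]
    print("MTTR:", mean_time_to_resolve(sample))
    print("Budget consumed:", round(budget_consumed(sample, 0.999), 2))
```

An action item that addresses the incident class consuming the most budget is a defensible candidate for the top of the tracker.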

Skills This Project Reinforces

Technical Leadership

M3: Design Docs, M4: Architecture Reviews

Communication

Stakeholder Messaging, Executive Summaries

Decision Making

ADRs, Tradeoff Analysis, Disagree-and-Commit

Incident Management

Postmortems, Root Cause Analysis, SLOs

Cross-Team Collaboration

RACI, Consensus Building, Escalation

System Design

Architecture Options, Data Flow Diagrams

Tools & Formats

Markdown: Documentation
Mermaid: Diagrams
YAML: Config
Python: Scripting
Git: Version Control
Confluence: Wiki
Notion: Knowledge Base
Jira: Tracking
Excalidraw: Whiteboard
Grafana: Monitoring

Starter Templates

design_doc_template.md · 15 KB · RFC template

Industry-standard design document template based on Netflix and Uber RFC formats with section prompts

adr_catalog_template.yaml · 8 KB · ADR framework

Architecture Decision Record template with status lifecycle, context fields, and consequence tracking

postmortem_template.md · 12 KB · Google SRE format

Blameless postmortem template following Google SRE standards with timeline, impact, and action item sections

incident_scenario.json · 5 KB · 1 scenario

Realistic data pipeline incident scenario with timestamps, logs, alerts, and team communication threads

Resume-Ready Bullets

Authored technical design documents for data platform migrations following Netflix RFC format, presenting 3+ architecture alternatives with scored tradeoff matrices to cross-functional review boards

Established an Architecture Decision Record (ADR) practice across 3 engineering teams, documenting 50+ technical decisions with cost-benefit analyses and reducing re-litigation of past decisions by 70%

Facilitated architecture review meetings for major system redesigns, navigating disagreements across 4 teams and driving consensus using a disagree-and-commit framework with documented dissent

Led a blameless incident postmortem process for data pipeline failures, implementing SLO-driven action items that reduced recurring incidents by 60% and cut MTTR from 4 hours to 45 minutes

Ready to Build Your Staff Engineer Leadership Portfolio?

This project builds the leadership artifacts that get you promoted — design docs, ADRs, consensus protocols, and postmortems used at top-tier companies.
