Hands-On Project
~16 hours · 4 parts

Experimentation & A/B Testing Platform

Architect a reliable data foundation to compute A/B testing metrics, statistical significance, and product KPIs without cross-team metric discrepancies.

Analytics Engineer · Product DE · Senior DE

Architecture Overview

EVENTS: Kafka · Schema Registry
COMPUTE: dbt · SciPy · Airflow
ANALYZE: Python · Great Expectations
GOVERN: FastAPI · Redis · Docker

What You'll Build

1

Foundation — Event Modeling & Assignment Infrastructure

3–4 hours

Design production-grade event schemas for experimentation, build dimensional models for experiments and variants, implement a deterministic assignment logging pipeline, and integrate feature flag data into the warehouse.

Event schema for variant assignments & user actions
Dimensional model (dim_experiments, dim_variants, fact_assignments)
Assignment logging pipeline with deterministic bucketing
Feature flag data model & integration layer
Event validation with schema registry
Backfill-safe incremental assignment pipeline
Milestone: Assignment infrastructure operational
2

Analytics — Metric Computation Engine

4–5 hours

Build a metric computation engine that joins assignments to outcomes, implements KPI hierarchy with North Star metrics, computes statistical significance with confidence intervals, and monitors guardrail metrics for every experiment.

KPI hierarchy model (North Star → leading → lagging indicators)
Metric computation pipeline (pre/in/post-experiment)
Statistical significance calculator (p-values, CI, MDE)
Guardrail metric monitoring system
Novelty & primacy effect detection
Metric definition registry (single source of truth)
Milestone: Metric engine computing results
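The significance calculator above typically rests on a two-proportion z-test. The project stack uses SciPy, but the same math can be sketched with the standard library alone; `ztest_proportions` and the 95% interval are illustrative choices, not the project's required API:

```python
import math
from typing import NamedTuple

class TestResult(NamedTuple):
    z: float
    p_value: float
    ci_low: float   # 95% CI on the lift (treatment - control)
    ci_high: float

def ztest_proportions(conv_c: int, n_c: int, conv_t: int, n_t: int) -> TestResult:
    """Two-sided two-proportion z-test with a 95% CI on the absolute lift."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    # Pooled conversion rate under H0 for the test statistic
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se_pooled
    # Two-sided p-value from the standard normal survival function
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Unpooled standard error for the confidence interval
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    diff = p_t - p_c
    return TestResult(z, p_value, diff - 1.96 * se, diff + 1.96 * se)
```

In the pipeline itself, `scipy.stats` would replace the hand-rolled normal tail, and the same inputs feed the MDE calculation at design time.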
3

Intelligence — A/B Test Analytics & Decision Framework

3–4 hours

Build automated experiment analysis pipelines with segment breakdowns, detect sample ratio mismatches (SRM), implement an automated ship/no-ship decision engine, and define metric ownership contracts across teams.

Experiment results aggregation pipeline
Segment analysis engine (cohort, geo, device, platform)
Sample ratio mismatch (SRM) detection system
Automated decision engine (ship/hold/extend)
Metric ownership contracts & data lineage
Experiment results dashboard data layer
Milestone: Decision framework operational
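SRM detection above is usually a chi-square goodness-of-fit test on observed versus expected assignment counts. For a two-variant split (one degree of freedom) the p-value has a closed form via `erfc`, so a stdlib sketch is possible; `srm_check` and the 0.001 alert threshold are illustrative assumptions:

```python
import math

def srm_check(n_control: int, n_treatment: int,
              expected_ratio: float = 0.5,
              alpha: float = 0.001) -> tuple[float, bool]:
    """Chi-square goodness-of-fit test for sample ratio mismatch
    on a two-variant experiment (1 degree of freedom).

    Returns (p_value, srm_detected). A very small alpha is conventional
    so that only clearly broken assignment pipelines trigger alerts.
    """
    total = n_control + n_treatment
    exp_c = total * expected_ratio
    exp_t = total * (1 - expected_ratio)
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # For df = 1, the chi-square survival function reduces to erfc
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p_value, p_value < alpha
```

Multi-variant experiments need the general chi-square survival function (e.g. `scipy.stats.chisquare`), but the alerting logic is the same: an SRM flag invalidates the experiment's results regardless of the metric readouts.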
4

Production — Platform Governance & Scale

3–4 hours

Deploy a production experimentation platform with lifecycle management, real-time feature flag serving, KPI consistency enforcement, experiment governance with approval workflows, and platform monitoring with SLAs.

Experiment lifecycle state machine (draft → running → concluded)
Real-time feature flag serving layer
KPI consistency framework (single source of truth)
Experiment governance (approvals, exclusion groups, traffic)
Platform monitoring & freshness SLAs
Enterprise experiment catalog & search
Milestone: Live experimentation platform deployed
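The lifecycle state machine above can be sketched as a transition table that rejects illegal moves. This is a minimal sketch under assumptions: the `paused` state and the `ExperimentLifecycle` class name are illustrative additions to the draft → running → concluded flow named in the spec:

```python
from enum import Enum

class State(Enum):
    DRAFT = "draft"
    RUNNING = "running"
    PAUSED = "paused"
    CONCLUDED = "concluded"

# Legal transitions; anything not listed here is rejected.
TRANSITIONS: dict[State, set[State]] = {
    State.DRAFT: {State.RUNNING},
    State.RUNNING: {State.PAUSED, State.CONCLUDED},
    State.PAUSED: {State.RUNNING, State.CONCLUDED},
    State.CONCLUDED: set(),  # terminal: results are frozen
}

class ExperimentLifecycle:
    def __init__(self) -> None:
        self.state = State.DRAFT

    def transition(self, target: State) -> None:
        if target not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {target.value}"
            )
        self.state = target
```

Making `concluded` terminal is what keeps reported results reproducible: a finished experiment can never silently resume and mutate its metrics.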

Skills This Project Reinforces

Product Thinking

M3: A/B Testing, M4: Experimentation Infra

Data Modeling

M2: Dimensional Modeling, M5: Advanced Patterns

System Design

M3: Ingestion, M5: Serving & Analytics

Data Observability

Freshness SLAs, Metric Monitoring

Cost Optimization

Query Optimization, Warehouse Sizing

CI/CD & Deployment

Pipeline CI, Blue/Green Deploys

Tech Stack

dbt Core (Transform)
Snowflake (Warehouse)
Python (Language)
Great Expectations (Quality)
Apache Kafka (Streaming)
Airflow (Orchestration)
FastAPI (Serving)
Redis (Cache)
SciPy (Statistics)
Plotly (Visualization)
OpenLineage (Lineage)
Docker (Containers)

Sample Datasets

raw_assignments.csv · 35 MB · 250K records

User-to-variant assignment events with timestamps and bucketing metadata

raw_user_events.csv · 52 MB · 500K records

Clickstream and conversion events tied to experiment-eligible users

raw_experiments.csv · 1 MB · 200 records

Experiment definitions with hypothesis, variants, traffic allocation, and dates

raw_feature_flags.csv · 500 KB · 2K records

Feature flag configurations with rollout rules and targeting criteria

Resume-Ready Bullets

Built end-to-end experimentation data platform supporting 200+ concurrent A/B tests with deterministic assignment logging, metric computation engine, and automated ship/no-ship decisions

Designed dimensional data model for experimentation (fact_assignments, fact_metric_events, dim_experiments) with feature flag integration processing 500K+ daily events

Implemented statistical significance pipeline computing p-values, confidence intervals, and minimum detectable effects across 50+ KPIs with guardrail metric monitoring

Created KPI consistency framework enforcing single-source-of-truth metric definitions across analytics, product, and data science teams with automated drift detection

Ready to Build Your Experimentation Platform?

This project gives you the cross-functional foundation that separates senior data engineers from everyone else — understanding how experimentation drives product decisions.
