Build an E-commerce Analytics Platform with dbt
Your SQL queries work. But your CEO's dashboard is wrong — and you can't explain why. Build the production analytics platform that prevents that forever.
┌─────────────┐
│ raw_* │ Sources (5 tables)
│ customers │ orders, items,
│ products │ web_events
└──────┬──────┘
│
v
┌─────────────┐
│ stg_* │ Staging (5 models)
│ Clean & │ Type casting,
│ Rename │ deduplication
└──────┬──────┘
│
┌────┴────┐
v v
┌────────┐ ┌────────┐
│ int_* │ │ int_* │ Intermediate
│ orders │ │ metrics│ (4 models)
└───┬────┘ └───┬────┘
│ │
└────┬─────┘
v
┌─────────────┐
│ fact_* │ Marts (5 models)
│ dim_* │ Star schema
│ fct_* │ BI-ready
└──────┬──────┘
│
v
┌─────────────┐
│ snapshots │ SCD Type 2
│ incremental│ 8x faster
└─────────────┘Fig 1.1: dbt model lineage — 17 models across 5 layers
What You'll Build
A real analytics platform — not a tutorial project. Something you'd be proud to show in an interview.
5-Layer dbt Architecture
Sources → Staging → Intermediate → Marts → Snapshots. 17 production models following industry best practices.
E-commerce Star Schema
Fact tables and dimensions that answer real business questions: revenue by segment, cohort retention, product performance.
Incremental + Historical Tracking
Models that refresh in 30 seconds (not 4 minutes) and SCD Type 2 snapshots that capture customer history over time.
Production CI/CD Pipeline
GitHub Actions with Slim CI, automated testing, monitoring alerts — the same setup used at companies like Airbnb and Stripe.
Progressive Build Path
Each part builds on the previous. Watch your analytics platform grow from 5 to 17 models.
Seed Your Database
ShopCo's raw e-commerce data — 3M+ records across 5 tables
# Clone the project & seed ShopCo data$ git clone https://github.com/aide/shopco-analytics.git$ cd shopco-analytics# Install dbt & seed raw data into your warehouse$ pip install dbt-postgres$ dbt seed --select shopco_raw# Verify: 5 tables, 3M+ records loaded$ dbt source freshness
customers
100K
orders
500K
order_items
2M
products
10K
web_events
5M
Production Standards
CI/CD Automation
GitHub Actions pipeline with Slim CI for pull requests, automated testing, and production deployment workflows.
Data Governance
Source freshness checks, schema tests, unique/not-null constraints, and documented column descriptions across all models.
Modular Architecture
5-layer design (Sources → Staging → Intermediate → Marts → Snapshots) following dbt best practices and naming conventions.
Prerequisites
- SQL fundamentals (SELECT, JOIN, GROUP BY, window functions)
- Command line basics (running commands, navigating dirs)
- Git basics (commit, push, pull requests)
- E-commerce metrics understanding (helpful, not required)
Related Learning Path
This project pairs perfectly with the dbt Fundamentals skill toolkit. Complete the modules first for maximum understanding, or dive straight in if you have prior dbt experience.
View dbt Skill ToolkitReady to build?