Build the modeling layer every dashboard sits on.
ShopCo's CEO just got three different revenue numbers from three teams. Build the dbt project that fixes it: 17 tested models in 5 layers (sources → staging → intermediate → marts → snapshots), a fact_orders + dim_customers + dim_products star schema, incremental refreshes on the high-volume tables, an SCD Type 2 customer snapshot, and a Slim CI + dbt Cloud + Slack alerting deploy pipeline.
This project answers the “walk me through how you’d model the analytics layer for an ecommerce platform” question, which comes up in nearly every analytics-engineer take-home and on-site round.
- A 17-model dbt project (5 staging + 2 intermediate + 5 marts + 1 snapshot + sources.yml)
- fact_orders + dim_customers + dim_products star schema at one-row-per-order grain
- fct_customer_cohorts + fct_product_performance + fct_customer_ltv (RFM-scored)
- Incremental materialization on fact_orders + stg_web_events with state-aware refresh
- SCD Type 2 customers_snapshot with dbt_valid_from / dbt_valid_to
- Two GitHub Actions workflows (Slim CI on PR + production deploy on merge) wired to dbt Cloud + Slack
Every part is fully unlocked on the free plan.
Free-tier project — full path from raw ShopCo seeds to a Slim-CI-gated, dbt-Cloud-deployed analytics modeling layer. No paywall, no card required.
dbt & Analytics Engineering
The dbt & Analytics Engineering path covers the primitives — this is the deeper dbt project that composes them into a real production codebase on a real ecommerce dataset.
Three sprints. Three checkpoints. One trustworthy modeling layer.
Each phase ends with a tagged commit, a passing dbt test suite, and an artifact a senior analytics engineer would actually accept.
5 staging models clean and standardize the ShopCo raw tables. sources.yml documents lineage. 4 test types green across every model.
- ✓ stg_customers / stg_orders / stg_order_items / stg_products / stg_web_events
- ✓ sources.yml + per-model schema tests (unique, not_null, accepted_values, relationships)
- ✓ dbt run + dbt test passing on the dev schema
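To make the four test types concrete, here is a minimal plain-Python sketch of what each one checks. It is not dbt's implementation (dbt compiles each test to a SQL query that returns the failing rows); the sample rows, the `status` column, and the allowed status values are assumptions made up for illustration.

```python
from typing import Any

Row = dict[str, Any]


def failing_unique(rows: list[Row], column: str) -> set[Any]:
    """Non-null values that appear more than once in `column`."""
    seen: set[Any] = set()
    duplicates: set[Any] = set()
    for row in rows:
        value = row[column]
        if value is None:
            continue  # nulls are left to the not_null check
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def failing_not_null(rows: list[Row], column: str) -> list[Row]:
    """Rows where `column` is missing a value."""
    return [row for row in rows if row[column] is None]


def failing_accepted_values(rows: list[Row], column: str, allowed: set[Any]) -> set[Any]:
    """Non-null values outside the allowed set."""
    return {
        row[column]
        for row in rows
        if row[column] is not None and row[column] not in allowed
    }


def failing_relationships(
    child_rows: list[Row], child_column: str,
    parent_rows: list[Row], parent_column: str,
) -> set[Any]:
    """Non-null child keys that have no matching parent key."""
    parent_keys = {row[parent_column] for row in parent_rows}
    return {
        row[child_column]
        for row in child_rows
        if row[child_column] is not None and row[child_column] not in parent_keys
    }


# Hypothetical sample rows, not taken from the ShopCo seeds.
ALLOWED_STATUSES = {"placed", "shipped", "refunded"}
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "status": "shipped"},
    {"order_id": 11, "customer_id": 2, "status": "refunded"},
    {"order_id": 11, "customer_id": 3, "status": "teleported"},
    {"order_id": None, "customer_id": None, "status": None},
]

duplicate_ids = failing_unique(orders, "order_id")
null_id_rows = failing_not_null(orders, "order_id")
bad_statuses = failing_accepted_values(orders, "status", ALLOWED_STATUSES)
orphan_customers = failing_relationships(orders, "customer_id", customers, "customer_id")
```

A test is green when its result is empty. In this sample all four checks fail, which is the kind of raw-data mess the staging layer exists to catch.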
Star schema at one-row-per-order grain. Two dimensions with lifetime + tier metrics. Three fact-style marts (cohorts, product performance, LTV with RFM scoring).
- ✓ fact_orders + dim_customers + dim_products (documented grain + tested PKs)
- ✓ fct_customer_cohorts + fct_product_performance for trend analysis
- ✓ fct_customer_ltv with RFM-bucketed churn risk
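As a rough illustration of RFM bucketing, the sketch below scores recency, frequency, and monetary value from 1 to 3 and maps the recency score to a churn-risk label. The cutoffs, the field names, and the churn rule are all hypothetical; the project's fct_customer_ltv model may bucket differently (for example with quantiles computed in SQL).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerStats:
    customer_id: int
    recency_days: int    # days since the customer's last order
    order_count: int     # lifetime number of orders
    total_spend: float   # lifetime revenue from this customer


def bucket(value: float, cutoffs: tuple[float, float], higher_is_better: bool = True) -> int:
    """Score a value 1-3 against two ascending cutoffs."""
    low, high = cutoffs
    if value < low:
        score = 1
    elif value < high:
        score = 2
    else:
        score = 3
    # For recency, a small value is good, so the scale is flipped.
    return score if higher_is_better else 4 - score


def rfm_score(stats: CustomerStats) -> tuple[int, int, int]:
    """Return (recency, frequency, monetary) scores. Cutoffs are assumed values."""
    recency = bucket(stats.recency_days, (30, 90), higher_is_better=False)
    frequency = bucket(stats.order_count, (2, 5))
    monetary = bucket(stats.total_spend, (100.0, 500.0))
    return recency, frequency, monetary


def churn_risk(stats: CustomerStats) -> str:
    """Hypothetical rule: churn risk follows the recency score."""
    recency, _, _ = rfm_score(stats)
    return {1: "high", 2: "medium", 3: "low"}[recency]


loyal = CustomerStats(customer_id=1, recency_days=10, order_count=8, total_spend=900.0)
lapsed = CustomerStats(customer_id=2, recency_days=200, order_count=1, total_spend=40.0)
```

Fixed cutoffs keep the sketch easy to reason about. Quantile-based buckets adapt to the data but shift every time the customer base changes, which is a trade-off worth documenting in the model's description.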
Incremental on the high-volume tables. SCD2 customer snapshot. dbt Cloud + Slim CI + production deploy + Slack + GitHub Pages docs. Branch protection + schema isolation across dev/staging/prod.
- ✓ Incremental on fact_orders + stg_web_events with state-aware refresh
- ✓ customers_snapshot (SCD2) + Snowflake cluster_by / BigQuery partition_by examples
- ✓ Slim CI + production deploy GH Actions + Slack webhook + dbt docs hosted
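The incremental pattern is easier to defend once you have seen it without the Jinja. The sketch below uses Python's built-in sqlite3 to mimic it: a full build on the first run, then on later runs only rows newer than the table's high-water mark, replaced by key. The `amount` and `updated_at` columns and the delete-then-insert approach are assumptions for illustration; the actual merge behavior depends on the warehouse adapter and the configured strategy.

```python
import sqlite3


def run_fact_orders(conn: sqlite3.Connection) -> int:
    """Build or refresh fact_orders; return the number of source rows processed."""
    exists = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = 'fact_orders'"
    ).fetchone()

    if not exists:
        # First run: no target table yet, so do a full build.
        conn.execute(
            "create table fact_orders as "
            "select order_id, amount, updated_at from stg_orders"
        )
        return conn.execute("select count(*) from fact_orders").fetchone()[0]

    # Later runs: only rows past the high-water mark (the is_incremental() branch).
    new_rows = conn.execute(
        "select order_id, amount, updated_at from stg_orders "
        "where updated_at > (select coalesce(max(updated_at), '') from fact_orders)"
    ).fetchall()

    # Replace by key so a changed order updates in place instead of duplicating.
    for order_id, amount, updated_at in new_rows:
        conn.execute("delete from fact_orders where order_id = ?", (order_id,))
        conn.execute(
            "insert into fact_orders values (?, ?, ?)",
            (order_id, amount, updated_at),
        )
    return len(new_rows)


conn = sqlite3.connect(":memory:")
conn.execute("create table stg_orders (order_id integer, amount real, updated_at text)")
conn.executemany(
    "insert into stg_orders values (?, ?, ?)",
    [(1, 20.0, "2024-01-01"), (2, 35.0, "2024-01-02")],
)
first_run = run_fact_orders(conn)  # full build

# Order 2 is corrected and order 3 arrives before the next run.
conn.execute(
    "update stg_orders set amount = 30.0, updated_at = '2024-01-03' where order_id = 2"
)
conn.execute("insert into stg_orders values (3, 50.0, '2024-01-03')")
second_run = run_fact_orders(conn)  # incremental refresh

final_rows = conn.execute(
    "select order_id, amount from fact_orders order by order_id"
).fetchall()
```

The second run touches two rows instead of three, and order 2 is updated rather than duplicated. That is the role unique_key plays: without it, the corrected order would land as a second row and break the one-row-per-order grain.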
One starter kit. 17 models, 5 seeds, dbt Cloud-ready.
The starter kit ships a complete dbt project skeleton with 5 seeded ShopCo CSVs (~10K rows total), profiles + sources + a sample GitHub Actions workflow — so you can dbt seed → dbt run → dbt test on Day One without hand-typing schemas.
What lives in the repo
Everything you need to run all 4 parts on Snowflake (or any dbt-supported warehouse), plus the ShopCo seed CSVs that simulate a multi-category retailer at the row counts this project assumes.
- models/ — 17 dbt models across staging / intermediate / marts (with .yml schemas)
- seeds/ — 5 ShopCo CSVs (customers, orders, order_items, products, web_events) at ~10K rows total
- snapshots/ — customers_snapshot SCD Type 2 spec
- .github/workflows/ — Slim CI on PR + production deploy on merge (state:modified+ --defer)
- dbt_project.yml — materialization defaults + path config + per-layer schemas
- profiles.yml.example — Snowflake/BigQuery/Postgres connection templates
ShopCo Analytics Starter Kit
Pre-built dbt project with 17 models, 5 ShopCo seed CSVs, GitHub Actions Slim CI workflow, and profiles templates. dbt seed → dbt run → dbt test in under 5 minutes. Skip the boilerplate, start on Part 01.
Three things you can put on your résumé.
Star schema with documented grain
fact_orders at one-row-per-order, conformed dim_customers + dim_products with tested PKs, and trend marts (cohorts, product performance) layered on top — the model every BI tool actually wants to query from.
Incremental + SCD2 + RFM scoring
Convert fact_orders to incremental with unique_key + state-aware refresh. Snapshot dim_customers via dbt snapshot (strategy='check'). Score LTV with RFM buckets — the patterns that turn a tutorial repo into a production warehouse.
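For intuition on the check strategy, here is a small Python sketch of SCD Type 2 bookkeeping: when a watched column changes, the open row is closed by setting dbt_valid_to, and a new version is appended with dbt_valid_from set to the snapshot time. This is a simplification and not dbt's implementation; it ignores hard deletes, and the `tier` column is an assumed example.

```python
from datetime import datetime
from typing import Any

Row = dict[str, Any]


def take_snapshot(
    history: list[Row],
    current_rows: list[Row],
    check_cols: list[str],
    now: datetime,
    key: str = "customer_id",
) -> list[Row]:
    """Apply one snapshot run to `history` (mutated in place and returned)."""
    # The open version of each entity is the row with no end timestamp.
    open_rows = {row[key]: row for row in history if row["dbt_valid_to"] is None}

    for row in current_rows:
        previous = open_rows.get(row[key])
        unchanged = previous is not None and all(
            previous[col] == row[col] for col in check_cols
        )
        if unchanged:
            continue  # nothing to record for this entity
        if previous is not None:
            previous["dbt_valid_to"] = now  # close the old version
        history.append({**row, "dbt_valid_from": now, "dbt_valid_to": None})
    return history


t1 = datetime(2024, 1, 1)
t2 = datetime(2024, 2, 1)
t3 = datetime(2024, 3, 1)

history: list[Row] = []
take_snapshot(history, [{"customer_id": 1, "tier": "bronze"}], ["tier"], t1)
take_snapshot(history, [{"customer_id": 1, "tier": "gold"}], ["tier"], t2)
take_snapshot(history, [{"customer_id": 1, "tier": "gold"}], ["tier"], t3)  # no change
```

After three runs the history holds two rows: bronze valid from t1 to t2, and gold valid from t2 with an open end. A point-in-time join against this table can then answer "what tier was this customer in when the order was placed?".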
dbt Cloud + Slim CI + production deploy
Two GitHub Actions workflows (Slim CI on PR with state:modified+ --defer; production deploy on merge). dbt Cloud schedule, Slack webhook alerts, dbt docs hosted on GitHub Pages, branch protection + dev/staging/prod schema isolation.
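The idea behind state:modified+ can be sketched in a few lines of Python: compare each model against the production state, then walk the DAG downstream from whatever changed. The edges and checksums below are invented for illustration, and real dbt compares more than file contents (configs and other properties also count), so treat this as a mental model only.

```python
from collections import deque


def modified_plus(
    children: dict[str, list[str]],
    prod_checksums: dict[str, str],
    current_checksums: dict[str, str],
) -> set[str]:
    """Models that are new or changed versus prod, plus all their descendants."""
    modified = {
        name
        for name, checksum in current_checksums.items()
        if prod_checksums.get(name) != checksum
    }
    selected = set(modified)
    queue = deque(modified)
    while queue:  # breadth-first walk downstream
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Assumed dependency edges (parent -> children) between models named on this page.
children = {
    "stg_orders": ["fact_orders"],
    "stg_customers": ["dim_customers"],
    "stg_products": ["dim_products"],
    "fact_orders": ["fct_customer_cohorts", "fct_customer_ltv"],
    "dim_customers": ["fct_customer_ltv"],
}

# Hypothetical checksums: only stg_orders differs on the PR branch.
prod = {name: "v1" for name in [
    "stg_orders", "stg_customers", "stg_products", "fact_orders",
    "dim_customers", "dim_products", "fct_customer_cohorts", "fct_customer_ltv",
]}
branch = {**prod, "stg_orders": "v2"}

to_build = modified_plus(children, prod, branch)
```

Here a change to stg_orders selects four models out of eight. Unselected parents such as dim_customers are not rebuilt in CI; with --defer, references to them resolve to the production relations instead.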
The same models — but built for the real warehouse.
Tutorials run a full rebuild on every commit, against a single schema, with manual Slack pings when something breaks. Production looks different — incremental keys, deferred state, snapshot strategies, and CI that only touches what actually changed. Here’s the diff, with the dbt primitives you reach for.
- materialized='incremental' with unique_key='order_id' and is_incremental() guard
- dbt snapshot with strategy='check' + check_cols emitting dbt_valid_from/dbt_valid_to
- dbt run --select state:modified+ --defer --state ../ (only changed models, deferred to prod state)
- generate_schema_name macro routes to DEV_username / STAGING / PROD
- dbt source freshness with warn_after / error_after per source (fail CI if upstream is stale)
- on-run-end hook + Slack webhook + dbt Cloud notification channels per environment
Pick this if you want the first dbt project on your résumé to be a real one.
Analytics engineers
You've used dbt at work but only on someone else's repo. This is the project that lets you defend grain decisions, materialization choices, and CI patterns from first principles.
Junior data engineers
You can write SQL but the analytics-engineer ladder is unfamiliar. This is the cleanest first dbt project — staging → marts → snapshots → Slim CI, on a domain everyone understands.
Career changers / first dbt project
You finished a SQL course and the next step is fuzzy. dbt + ecommerce + GH Actions is the portfolio piece recruiters actually open and read.
Interview prep
Take-homes increasingly ship a CSV and ask for a dbt project. After this you can produce one in an afternoon and defend the modeling decisions in the follow-up call.
Going deeper? Two tracks back this project.
The dbt curriculum is the spine (linked above). These two let you go deeper on the layers next to it — SQL fluency this project assumes, and the advanced data-modeling theory behind the star schema + SCD2 patterns.
Ready to build the layer everyone queries from?
Part 01 takes about 2 hours. By the end you'll have dbt + the ShopCo seeds running, 5 staging models clean, and 4 test types green across every model.