P12 · CI/CD Data Platform · Platform track · PRO · module 01 free preview

Build the platform under the platform: CI/CD for data

The DevOps spine for a 15-engineer dbt + Snowflake team: trunk-based Git, dbt tests in GitHub Actions with PR-scoped schemas, blue/green deploys via atomic view swaps, Snowflake time-travel rollback, and Terraform-managed warehouses + roles + grants.

Timeline: 20-26 hours
Difficulty: Senior+
Stack: GitHub Actions · dbt · Snowflake · Terraform

The platform-engineer system-design round — when an interviewer asks how you’d ship dbt safely for a team of 15, this is the project that lets you walk the whole pipeline without hand-waving any layer.

By the end you will have shipped
  • Trunk-based Git workflow: pre-commit (SQLFluff + dbt compile + secret scan), CODEOWNERS, branch protection
  • GitHub Actions dbt CI with generic + custom + singular tests, PR-scoped CI schemas, and Slim CI on state:modified+
  • Blue/green deploys on Snowflake: BLUE/GREEN schemas, atomic view swap inside BEGIN/COMMIT, DEPLOY_STATE audit
  • Time-travel rollback playbook: <60s view re-swap + Snowflake AT/BEFORE for point-in-time recovery
  • Terraform-managed Snowflake: warehouses, role hierarchy (LOADER / TRANSFORMER / REPORTER), grants across dev/staging/prod
  • Multi-stage Docker image + freshness SLA checks + SEV runbooks for the on-call rotation
PREREQ: Comfortable with Git PR flow, dbt build, and basic SQL DDL. A Snowflake account is helpful but not required; Module 01 walks through setup. Docker and Terraform are introduced from scratch in Module 04.
[Pipeline diagram: pipelineops.platform.* · deploy 7d3f1a → GREEN · blue/green active]
  • PR · CI: trunk-based feature PR; pre-commit (sqlfluff · compile · scan); GitHub Actions dbt build on PR; Slim CI on state:modified+; CODEOWNERS · branch protection
  • Blue / Green: BLUE active (prod views); GREEN building (dbt run); DEPLOY_STATE active=BLUE · sha=7d3f1a; atomic view swap
  • Production: PROD views → BLUE.* (today); dashboards (revenue · ops · ML); 0 downtime swap; all consumers re-point
  • SLA · SEV: freshness ≤ 24h check; row-count anomaly ±20%; SEV1 runbook ≤ 5 min ack; DEPLOY_STATE audit log; 99.9% SLO target
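# Atomic view swap (Module 03)
BEGIN;
CREATE OR REPLACE VIEW PROD.fct_orders
AS SELECT * FROM GREEN.fct_orders;
COMMIT;
-- 0 downtime, all views re-point
-- Note: Snowflake auto-commits each DDL statement, so each CREATE OR REPLACE VIEW is atomic by itself; BEGIN/COMMIT does not group several view swaps into one transaction.
# Time-travel rollback (Module 03)
CREATE OR REPLACE TABLE PROD.fct_orders AS
SELECT * FROM PROD.fct_orders
AT (TIMESTAMP => deploy_started_at);
-- deploy_started_at is a placeholder for the deploy start time, supplied as a timestamp value
-- <60s point-in-time recovery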
<60s rollback · 0 downtime swaps · 3 envs as code
Why CI/CD for data, why now

The #1 skill gap between data engineer and platform engineer.

Most data teams still ship dbt by hand: `dbt run --target prod` from someone's laptop, then hope for the best. Mature orgs (Netflix, Stripe, Airbnb, dbt Labs) treat data infra like application infra: PRs, automated tests, blue/green, rollback. This is the project that proves you can build to that bar.

PRs, not laptop pushes

Trunk-based flow with pre-commit, CODEOWNERS, and required CI checks. Every change is reviewed and tested before main.
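
To make the secret-scan step of the pre-commit chain concrete, here is a minimal sketch of what such a hook could look like. The rule names and regular expressions are illustrative assumptions, not the rules the starter kit ships, and a real scanner would carry a much larger rule set.

```python
import re
from pathlib import Path

# Hypothetical rules for illustration only; not the starter kit's actual rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # Quoted literal after "password:" or "password="; Jinja lookups ({{ ... }}) are skipped.
    "inline_password": re.compile(r"(?i)password\s*[:=]\s*['\"][^'\"{}]{6,}['\"]"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, rule))
    return hits


def scan_files(paths: list[str]) -> int:
    """Scan the given files; return 1 if anything suspicious is found, else 0."""
    failed = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="ignore")
        for lineno, rule in find_secrets(text):
            print(f"{path}:{lineno}: possible secret ({rule})")
            failed = True
    return 1 if failed else 0
```

pre-commit passes the staged file names to a hook as arguments and treats a non-zero exit code as a failure, so a wrapper script could exit with the value of `scan_files(sys.argv[1:])`.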

Slim CI keeps PRs <5 min

Run only the dbt models touched by the PR via state:modified+. Manifest caching cuts a 45-min full-build to a 4-min selective run.
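
The selection logic behind state:modified+ can be pictured as a graph walk: start from the models whose definitions changed and follow every downstream edge. The sketch below is a simplified illustration of that idea, not dbt's implementation; the DAG and model names other than fct_orders are made up for the example.

```python
from collections import deque


def select_modified_plus(children: dict[str, list[str]], modified: set[str]) -> list[str]:
    """Return the modified models plus all of their descendants, sorted by name."""
    selected = set(modified)
    queue = deque(modified)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return sorted(selected)


# Hypothetical DAG: each model maps to its direct children.
dag = {
    "stg_orders": ["int_orders_enriched"],
    "stg_customers": ["int_orders_enriched", "dim_customers"],
    "int_orders_enriched": ["fct_orders"],
    "fct_orders": [],
    "dim_customers": [],
}

print(select_modified_plus(dag, {"stg_orders"}))
# ['fct_orders', 'int_orders_enriched', 'stg_orders']
```

In dbt itself, "modified" is decided by comparing the current project against the manifest passed with --state, which is usually an artifact saved from an earlier production run.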

Zero-downtime deploys

Atomic view swap inside BEGIN/COMMIT. Production reads stay pointed at BLUE while GREEN builds, then re-point in one transaction.
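
As a rough sketch of the re-point step, the helper below generates one CREATE OR REPLACE VIEW statement per model. The function name and the identifier check are assumptions for illustration; the project's BlueGreenDeployer class may be organised differently. One caveat worth verifying on your own account: Snowflake documents DDL statements as auto-committing, so each statement swaps its own view atomically, but a BEGIN/COMMIT wrapper should not be relied on to make a multi-view swap all-or-nothing.

```python
import re

# Conservative check for unquoted identifiers, to keep generated SQL predictable.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_swap_statements(models: list[str], target_schema: str, prod_schema: str = "PROD") -> list[str]:
    """Build one view re-point statement per model, aimed at BLUE or GREEN."""
    if target_schema not in {"BLUE", "GREEN"}:
        raise ValueError(f"target_schema must be BLUE or GREEN, got {target_schema!r}")
    statements = []
    for model in models:
        if not IDENTIFIER.match(model):
            raise ValueError(f"unexpected model name: {model!r}")
        statements.append(
            f"CREATE OR REPLACE VIEW {prod_schema}.{model} "
            f"AS SELECT * FROM {target_schema}.{model};"
        )
    return statements


for statement in build_swap_statements(["fct_orders", "dim_customers"], "GREEN"):
    print(statement)
```

Generating the statements separately from executing them keeps the swap plan easy to log and review before anything touches production.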

Rollback you can trust

Time-travel via Snowflake AT/BEFORE plus a re-swap of production views — sub-minute recovery, audited in DEPLOY_STATE.
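
The two rollback paths can be sketched as small helpers: one picks the schema to re-point at, the other builds a point-in-time restore statement. This is an illustration under assumptions: the helper names and the _RESTORED suffix are hypothetical, and the restore uses Snowflake's CREATE TABLE ... CLONE ... AT form rather than the CREATE TABLE AS SELECT form, so the damaged table stays in place for inspection.

```python
from datetime import datetime, timezone


def previous_color(active: str) -> str:
    """Return the schema to re-point production views at when rolling back."""
    return {"BLUE": "GREEN", "GREEN": "BLUE"}[active]


def time_travel_clone_sql(table: str, as_of: datetime, suffix: str = "_RESTORED") -> str:
    """Build a statement that clones `table` as it existed at `as_of`."""
    if as_of.tzinfo is None:
        raise ValueError("as_of must be timezone-aware")
    # ISO 8601 in UTC, e.g. 2024-05-01T12:00:00+00:00
    ts = as_of.astimezone(timezone.utc).isoformat(timespec="seconds")
    return (
        f"CREATE OR REPLACE TABLE {table}{suffix} CLONE {table} "
        f"AT (TIMESTAMP => '{ts}'::TIMESTAMP_TZ);"
    )


deploy_started_at = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
print(previous_color("GREEN"))
print(time_travel_clone_sql("GREEN.fct_orders", deploy_started_at))
```

Time travel only reaches back as far as the table's data retention period, so the restore path is worth rehearsing against that limit.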

Curriculum · 4 modules · 20-26 hours

Module 01 is free. The rest unlocks with PRO.

Try Module 01 — set up the Git foundation, configure dev/staging/prod profiles, install pre-commit, validate environment parity. About 4 hours. If it clicks, upgrade to unlock the CI, blue/green, and Terraform modules.
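
For a sense of what an environment parity check might cover, here is a small sketch that inspects already-parsed profile targets. It is not the repo's validate_environments.py; the required keys and the two rules (same keys everywhere, a distinct schema per target) are assumptions chosen for illustration.

```python
# Assumed set of keys every target should define; adjust to your profiles.yml.
REQUIRED_KEYS = {"type", "account", "database", "schema", "warehouse", "role"}
ENVIRONMENTS = ("dev", "staging", "prod")


def parity_problems(targets: dict[str, dict[str, str]]) -> list[str]:
    """Return human-readable parity problems; an empty list means the check passed."""
    problems = []
    for env in ENVIRONMENTS:
        if env not in targets:
            problems.append(f"{env}: target missing")
            continue
        missing = REQUIRED_KEYS - set(targets[env])
        if missing:
            problems.append(f"{env}: missing keys {sorted(missing)}")
    schemas = [t["schema"] for t in targets.values() if "schema" in t]
    if len(set(schemas)) != len(schemas):
        problems.append("schemas are not isolated across targets")
    return problems


# Hypothetical targets, as they might look after loading profiles.yml.
example = {
    env: {
        "type": "snowflake",
        "account": "example_account",
        "database": "ANALYTICS",
        "schema": env.upper(),
        "warehouse": "TRANSFORM_WH",
        "role": "TRANSFORMER",
    }
    for env in ENVIRONMENTS
}
print(parity_problems(example))
# []
```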

P12 · 20-26 hours · 4 modules
Free preview PRO required
Module 01 is free — no card required. Lay the Git + environment foundation before paying.
M01
Foundation: Git workflow + dev/staging/prod environments
Trunk-based Git for data teams. Data-aware .gitignore + .gitattributes (Git LFS for seed CSVs). CODEOWNERS + PR template + branch protection. Pre-commit hooks (SQLFluff, dbt compile, secret scan). dbt profiles.yml with isolated dev/staging/prod schemas. validate_environments.py for parity checks.
4-5h · 6 lessons · FREE PREVIEW
Start →
M02
Automation: dbt CI on GitHub Actions + quality gates
dbt test suite (generic + custom + singular tests with reconciliation). GitHub Actions dbt-ci.yml running dbt build on PR. PR-scoped CI schemas + cleanup workflow on PR close. Slim CI with state:modified+ and manifest caching. Test-result reporting back to PR comments.
5-6h · 8 lessons · PRO TIER
Unlock with PRO →
M03
Deployment: blue/green view swaps + time-travel rollback
BLUE/GREEN schema pattern + DEPLOY_STATE audit table. BlueGreenDeployer Python class. Pre-cutover validation (row count, null PKs, freshness, schema compatibility). Atomic view re-point inside BEGIN/COMMIT. Snowflake time-travel rollback (AT/BEFORE) and the orchestrator that ties build → validate → swap → notify together.
5-7h · 8 lessons · PRO TIER
Unlock with PRO →
M04
Production: Terraform IaC + SLA monitoring + SEV runbooks
Terraform Snowflake provider — warehouses with auto-suspend tiering, role hierarchy (LOADER / TRANSFORMER / REPORTER), grants per environment. Multi-stage Dockerfile + docker-compose for local parity. Freshness SLA checks + row-count anomaly macro. SEV runbook templates and the deployment audit log model.
5-6h · 8 lessons · PRO TIER
Unlock with PRO →
3 modules locked · Unlock all PRO content for $29/mo
Upgrade to PRO →
Backed by curriculum

DataOps: CI/CD & Infrastructure as Code

8 modules · 14 hours · Git for data · PR-scoped CI · blue/green · rollback · Terraform
Open curriculum

This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.

The build, in 3 phases

Three sprints. Three checkpoints. One production-grade pipeline.

Each phase ends with a tagged commit and a runnable artifact. No ambiguity about where you are.

01 · ~9h
Foundation + automation

Trunk-based Git with pre-commit + branch protection (Module 01). dbt CI on GitHub Actions with PR-scoped schemas + Slim CI cutting full-build time to a selective <5-min run (Module 02).

  • Pre-commit + CODEOWNERS + dev/staging/prod profiles
  • dbt CI workflow with PR-scoped schemas
  • Slim CI on state:modified+ with manifest cache
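
PR-scoped schemas come down to a naming rule plus a cleanup step. The sketch below follows the CI_PR_N naming used on this page; the database name ANALYTICS_CI and the helper names are hypothetical, and the real cleanup workflow may work differently.

```python
import re

CI_SCHEMA_PATTERN = re.compile(r"^CI_PR_(\d+)$")


def ci_schema_name(pr_number: int) -> str:
    """Derive the isolated schema name for a pull request."""
    if pr_number <= 0:
        raise ValueError("pr_number must be a positive integer")
    return f"CI_PR_{pr_number}"


def drop_schema_sql(schema: str, database: str = "ANALYTICS_CI") -> str:
    """Build the cleanup statement, refusing anything that is not a CI schema."""
    if not CI_SCHEMA_PATTERN.match(schema):
        raise ValueError(f"refusing to drop non-CI schema: {schema}")
    return f"DROP SCHEMA IF EXISTS {database}.{schema} CASCADE;"


def orphaned_schemas(existing: list[str], open_prs: set[int]) -> list[str]:
    """Return CI schemas whose pull request is no longer open."""
    orphans = []
    for name in existing:
        match = CI_SCHEMA_PATTERN.match(name.upper())
        if match and int(match.group(1)) not in open_prs:
            orphans.append(name.upper())
    return sorted(orphans)


print(drop_schema_sql(ci_schema_name(42)))
# DROP SCHEMA IF EXISTS ANALYTICS_CI.CI_PR_42 CASCADE;
```

The guard in drop_schema_sql is the important design choice: a cleanup job that can only ever drop schemas matching the CI pattern cannot take out BLUE, GREEN, or PROD by accident.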
02 · ~6h
Blue/green deploy + rollback

BLUE/GREEN schema pattern with atomic view swap inside BEGIN/COMMIT. Pre-cutover checks. Time-travel rollback + DEPLOY_STATE audit. Orchestrator chains the whole pipeline.

  • BlueGreenDeployer + DEPLOY_STATE table
  • Pre-cutover validation framework
  • Time-travel rollback + orchestrator.py
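
A pre-cutover gate can be thought of as a pure function over statistics gathered from both schemas. The sketch below covers three of the checks named on this page (null primary keys, row count, schema compatibility) and leaves freshness out. The TableStats shape and the 20% row-drop tolerance are assumptions, not values taken from pre_cutover_checks.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    row_count: int
    null_pk_count: int
    columns: frozenset[str]


def pre_cutover_failures(blue: TableStats, green: TableStats, max_row_drop: float = 0.20) -> list[str]:
    """Compare the GREEN candidate with the live BLUE table; empty list means safe to swap."""
    failures = []
    if green.null_pk_count > 0:
        failures.append("null_pk")
    if blue.row_count > 0 and green.row_count < blue.row_count * (1 - max_row_drop):
        failures.append("row_count")
    # Consumers read through views, so GREEN must keep every column BLUE exposes.
    missing = blue.columns - green.columns
    if missing:
        failures.append("schema: missing " + ", ".join(sorted(missing)))
    return failures


blue = TableStats(5000, 0, frozenset({"order_id", "amount"}))
green = TableStats(5100, 0, frozenset({"order_id", "amount", "currency"}))
print(pre_cutover_failures(blue, green))
# []
```

Returning the list of failures, rather than a single boolean, gives the notify step something specific to report when a deploy is blocked.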
03 · ~6h
Terraform + SLA hardening

Snowflake-as-code via Terraform: warehouses, RBAC role hierarchy, grants per env. Multi-stage Docker image. Freshness SLA + row-count anomaly checks. SEV runbooks + deployment audit model.

  • terraform/ for warehouses + roles + grants
  • Multi-stage Dockerfile + docker-compose
  • Freshness SLA + SEV runbooks + audit model
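
The freshness and row-count checks reduce to two comparisons. The sketch below uses the 24-hour and ±20% thresholds shown on this page as defaults; the function names are hypothetical, and the project implements the anomaly check as a dbt macro rather than in Python.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at: datetime, now: datetime, max_age: timedelta = timedelta(hours=24)) -> bool:
    """True when the most recent load is within the freshness SLA."""
    return now - last_loaded_at <= max_age


def row_count_anomaly(today: int, baseline: float, tolerance: float = 0.20) -> bool:
    """True when today's row count deviates from the baseline by more than the tolerance."""
    if baseline == 0:
        return today != 0
    return abs(today - baseline) / baseline > tolerance


now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
print(is_fresh(now - timedelta(hours=23), now))   # True
print(is_fresh(now - timedelta(hours=25), now))   # False
print(row_count_anomaly(3900, 5000))              # True, a 22% drop
print(row_count_anomaly(5500, 5000))              # False, a 10% rise
```

A trailing average of recent daily counts is one reasonable choice for the baseline; a single previous day tends to flag ordinary weekend dips.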
Project setup · 10 minutes

One repo. dbt + Snowflake + GitHub Actions + Terraform.

The starter kit ships the full PipelineOps Inc repo — dbt project skeleton, GitHub Actions workflows, blue/green Python orchestrator, Terraform modules, sample CSVs with intentional data-quality issues, and the seed schema for Snowflake.

What lives in the repo

Everything you need to build the platform under the platform — dbt, CI workflows, deploy scripts, IaC, and seed data with deliberate quality bugs so your CI gates catch real failures.

  • models/ + tests/ — staging/intermediate/marts dbt project with generic, custom, and singular tests
  • .github/workflows/ — dbt-ci.yml, dbt-slim-ci.yml, env-parity-check.yml, schema cleanup, manifest cache
  • deploy/ — blue_green.py, pre_cutover_checks.py, rollback.py, backfill.py, orchestrator.py
  • terraform/ — warehouses.tf, roles.tf, grants per dev/staging/prod with remote state backend
  • Dockerfile + docker-compose.yml — multi-stage build (Python 3.11 slim) + local-parity stack
  • runbooks/ + scripts/ — SEV templates + check_freshness.py + chaos_test.py skeletons
Download · Starter Kit

CI/CD Data Platform Starter Kit

Pre-built repo with dbt project skeleton, GitHub Actions workflows, blue/green orchestrator, Terraform modules, SEV runbook templates, and 14,800 rows of seed data with intentional QA issues.

Pro project · ~250 KB · 73 files · 4 sample CSVs (14.8k rows total)
~/projects/cicd-data-platform — zsh
1. Clone and install pre-commit hooks
$ git clone https://github.com/ai-de/p12-cicd-data-platform
$ cd p12-cicd-data-platform && pre-commit install
2. Configure dbt profiles + verify environment parity
$ cp profiles.yml.example ~/.dbt/profiles.yml
$ python scripts/validate_environments.py --target dev
3. Run the dbt CI suite locally (matches what GitHub Actions runs)
$ dbt seed --target ci
$ dbt build --target ci --select state:modified+ --state ./target
4. Provision Snowflake objects via Terraform
$ cd terraform && terraform init
$ terraform plan -var-file=dev.tfvars
$ terraform apply -var-file=dev.tfvars
5. Run a blue/green deploy + simulate rollback
$ python deploy/orchestrator.py --target green --git-sha $(git rev-parse HEAD)
$ python deploy/rollback.py --deploy-id <id> --strategy view-swap
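
Step 5 above chains build → validate → swap → notify. A minimal sketch of that control flow, with each stage injected as a callable so it can be exercised without a warehouse, might look like the following. The DeployResult shape and the function signature are assumptions for illustration, not the interface of deploy/orchestrator.py.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DeployResult:
    git_sha: str
    target: str
    status: str = "aborted"          # becomes "swapped" only after a successful cutover
    failures: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)


def orchestrate(
    git_sha: str,
    target: str,
    build: Callable[[str], None],
    validate: Callable[[str], list[str]],
    swap: Callable[[str], None],
    notify: Callable[[DeployResult], None],
) -> DeployResult:
    """Run the deploy stages in order; skip the swap if validation reports failures."""
    result = DeployResult(git_sha=git_sha, target=target)
    build(target)
    result.steps.append("build")
    result.failures = validate(target)
    result.steps.append("validate")
    if not result.failures:
        swap(target)
        result.steps.append("swap")
        result.status = "swapped"
    notify(result)                   # always notify, whether swapped or aborted
    result.steps.append("notify")
    return result


# Stub stages standing in for dbt, the pre-cutover checks, and the view swap.
outcome = orchestrate(
    "7d3f1a", "GREEN",
    build=lambda target: None,
    validate=lambda target: [],
    swap=lambda target: None,
    notify=lambda result: print(result.status),
)
```

Production views are only touched inside the swap stage, so a failed build or validation leaves readers on the currently active schema.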
1.5k customers · 300 products · 5k orders · 8k events
What changes vs a hand-deployed dbt project

The same dbt project — but with the safety net a 15-engineer team needs.

Most teams ship dbt by hand: dbt run --target prod from a laptop, no review, no rollback. The patterns in this project (state:modified+, atomic view swaps, time travel, Terraform-managed RBAC) are what unlock safe parallelism without breaking the warehouse.

Hand-deployed version · What most teams have today
  • Deploys: dbt run --target prod from a laptop
  • Tests: Run locally, sometimes, when remembered
  • PR validation: Manual SQL review, no CI
  • Rollback: git revert + redeploy + cross fingers
  • Schema isolation: Single PROD schema for everything
  • Snowflake config: Click-Ops in the UI; drift everywhere
  • Incident response: Slack message; no runbook, no SLO
Your CI/CD platform version · Modules 02–04
  • Deploys: GitHub Actions runs dbt build --target prod on merge, audit-logged
  • Tests: Generic + custom + singular tests run on every PR, blocking merges
  • PR validation: dbt build in PR-scoped CI_PR_N schema, cleaned on close
  • Rollback: View re-swap + AT/BEFORE time travel (sub-minute, recorded in DEPLOY_STATE)
  • Schema isolation: BLUE/GREEN schemas; production reads see only the active one
  • Snowflake config: Terraform modules for warehouses, roles, grants; plan/apply per env
  • Incident response: SEV1/SEV2 runbooks + freshness SLA checks + deployment audit log
PRO benefit · code review

Real review from senior engineers who shipped this stack.

Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.

CR

4 reviews / month

Submit a repo, a PR, or a Terraform plan. Reviewer is matched to your domain — dbt + GitHub Actions + Snowflake for this project. Async, comments inline, average turnaround 31 hours.

31h
avg turnaround
9.2/10
helpfulness
94%
return next month
OH

2 office hours / month

Live 30-min sessions with a senior platform engineer. Architecture questions, walk a tricky deploy migration, mock a system-design interview on CI/CD for data. Group sessions also available.

30 min
per session
2 / mo
included
+ group
unlimited
What PRO unlocks

One subscription. 15+ projects, all curriculum, code review.

PRO is built for engineers who want production-grade builds and feedback loops — not more tutorials.

What you get · FREE vs PRO vs EXPERT
  • Projects (production-grade builds): FREE 2 · PRO 15+ · EXPERT 8
  • Curriculum modules (all 7 tracks): FREE Phase 1 only · PRO All · EXPERT All + bonus
  • Code review credits (senior engineer review): FREE 0 · PRO 4 / month · EXPERT Unlimited
  • Career path access (5 paths × full plans): FREE 1 path · PRO All 5 · EXPERT All 5 + 1:1
  • Certificate (verifiable on LinkedIn): PRO Yes · EXPERT Yes + portfolio review
  • Community (Discord + office hours): FREE Read-only · PRO Full + 2/mo · EXPERT Full + 4/mo
$29/mo · billed monthly · cancel anytime
or annual: $249/yr (save 28%)
Upgrade to PRO
Who this is for

Pick this if your dbt project is load-bearing for someone else’s revenue.

DE

Data engineers going senior+

You ship dbt every day but deploys are still hand-rolled. This is the project that turns 'I can write models' into 'I can run a 15-engineer dbt team safely.'

AE

Analytics engineers ready to own delivery

You can build the marts. Now you want to own the pipeline that ships them — PRs, CI, blue/green — without waiting for a platform team to build it for you.

PE

Platform engineers entering data

You've shipped CI/CD for app code. Data is a new shape — atomic view swaps instead of containers, time travel instead of revert. This translates the patterns.

EM

Eng managers and tech leads

You're the one writing the SLA and the SEV runbook. This gives you the reference architecture and the language to argue for it in your roadmap reviews.

FAQ

Quick answers.

Does this project cover Kubernetes, Helm, or a full Airflow deployment?
No, and we used to advertise that, which was misleading. The deploy substrate in this project is GitHub Actions + Docker + Terraform-managed Snowflake. There's an Airflow DAG-versioning pattern shown in Module 03, but no scheduler/worker setup. Kubernetes and Helm aren't in scope; they get a dedicated platform project later in the catalog.

Is 99.9% a measured uptime figure?
No, it's the SLO target the runbooks are designed around. Module 04 walks you through the freshness checks, row-count anomalies, and SEV escalation tiers that a 99.9%-targeted platform needs. The project doesn't run a 30-day production load to measure uptime; that's not a tutorial-scale exercise.

How does this project relate to DataGuard?
DataGuard goes deep on the observability layer (freshness, anomaly, lineage, alerting). This project goes deep on the deploy spine (PRs, CI, blue/green, rollback, IaC). They sit next to each other: your CI/CD platform is what catches breaking changes before they ship; DataGuard is what catches breakages after they ship. Both are in the platform track and most senior+ engineers do both.

Do I need a Snowflake account?
Helpful but not required. Module 01 walks setup (free Snowflake trial works), and the dbt + Python deploy code is testable locally with the Docker Compose stack. You won't get the full blue/green time-travel experience without a Snowflake account, but you can build and test everything else.

Are the GitHub Actions workflows real?
Real. Module 02 ships dbt-ci.yml, dbt-slim-ci.yml, an env-parity-check workflow, a PR-scoped schema cleanup workflow, and a manifest-cache workflow, all runnable in your fork. They're tutorial-scoped (small repo, single Snowflake account) but the patterns transfer directly to a 150-model production repo.

What does PRO include?
All 15+ PRO projects, 4 code-review credits per month, 2 office-hours sessions, full curriculum across all 7 tracks, all 5 career paths, certificate of completion, and full community access. Cancel anytime.

Ready to build the platform under the platform?

Start with Module 01 — free, no card. About 4 hours. By the end you'll have trunk-based Git, pre-commit hooks, dev/staging/prod dbt profiles, and a parity-validated environment foundation. If it clicks, upgrade to unlock CI, blue/green, and Terraform modules.
