ai-de.net/Projects/P12 · CI/CD Data Platform

PRO · module 01 free preview · Platform track · P12

Build the platform under the platform: CI/CD for data

The DevOps spine for a 15-engineer dbt + Snowflake team: trunk-based Git, dbt tests in GitHub Actions with PR-scoped schemas, blue/green deploys via atomic view swaps, Snowflake time-travel rollback, and Terraform-managed warehouses + roles + grants.

Timeline

20-26 hours

Difficulty

Senior+

Stack

GitHub Actions · dbt · Snowflake · Terraform


The platform-engineer system-design round — when an interviewer asks how you’d ship dbt safely for a team of 15, this is the project that lets you walk the whole pipeline without hand-waving any layer.

By the end you will have shipped

Trunk-based Git workflow: pre-commit (SQLFluff + dbt compile + secret scan), CODEOWNERS, branch protection
GitHub Actions dbt CI with generic + custom + singular tests, PR-scoped CI schemas, and Slim CI on state:modified+
Blue/green deploys on Snowflake: BLUE/GREEN schemas, atomic view swap inside BEGIN/COMMIT, DEPLOY_STATE audit
Time-travel rollback playbook: <60s view re-swap + Snowflake AT/BEFORE for point-in-time recovery
Terraform-managed Snowflake: warehouses, role hierarchy (LOADER / TRANSFORMER / REPORTER), grants across dev/staging/prod
Multi-stage Docker image + freshness SLA checks + SEV runbooks for the on-call rotation

PREREQ: Comfortable with Git PR flow, dbt build, and basic SQL DDL. A Snowflake account is helpful but not required; Module 01 walks through setup. Docker and Terraform are introduced from scratch in Module 04.

Pipeline diagram: pipelineops.platform.* · deploy 7d3f1a → GREEN (blue/green active)

PR · CI: feature/PR (trunk-based) → pre-commit (sqlfluff · compile · scan) → GH Actions (dbt build on PR) → Slim CI (state:modified+); CODEOWNERS · branch protection
Blue / Green: BLUE (ACTIVE · prod views) · GREEN (BUILDING · dbt run) · DEPLOY_STATE (active=BLUE · sha=7d3f1a); atomic view swap
Production: PROD.views → BLUE.* (today) · dashboards (revenue · ops · ML); 0 downtime swap, all consumers re-point
SLA · SEV: freshness (≤ 24h check) · anomaly (row count ±20%) · SEV1 runbook (≤ 5 min ack) · audit log (DEPLOY_STATE); 99.9% SLO target

-- Atomic view swap (Module 03)
BEGIN;
CREATE OR REPLACE VIEW PROD.fct_orders
  AS SELECT * FROM GREEN.fct_orders;
COMMIT;
-- → 0 downtime, all views re-point

-- Time-travel rollback (Module 03)
-- deploy_started_at stands for the timestamp recorded when the deploy began
CREATE OR REPLACE TABLE PROD.fct_orders AS
  SELECT * FROM PROD.fct_orders
  AT (TIMESTAMP => deploy_started_at);
-- → <60s point-in-time recovery

<60s rollback · zero-downtime swaps · envs as code

Why CI/CD for data, why now

The #1 skill gap between data engineer and platform engineer.

Most data teams still ship dbt by hand: `dbt run --target prod` from someone's laptop, then hope for the best. Mature orgs (Netflix, Stripe, Airbnb, dbt Labs) treat data infra like application infra: PRs, automated tests, blue/green, rollback. This is the project that proves you can meet that bar.

PRs, not laptop pushes

Trunk-based flow with pre-commit, CODEOWNERS, and required CI checks. Every change is reviewed and tested before main.

Slim CI keeps PRs <5 min

Run only the dbt models touched by the PR, plus their downstream dependents, via state:modified+. Comparing against a cached manifest cuts a 45-min full build to a 4-min selective run.
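To make the selection concrete, here is a rough sketch of what `state:modified+` resolves to: models whose content changed relative to a saved manifest, plus everything downstream of them. It is a simplification rather than dbt's implementation (dbt's own comparison looks at more than file checksums). The function name and the toy manifests are hypothetical; the `nodes`, `checksum`, and `child_map` keys mirror the layout of dbt's manifest.json.

```python
from collections import deque


def modified_plus(old_manifest: dict, new_manifest: dict) -> set[str]:
    """Return modified or new nodes plus all of their descendants."""
    old_nodes = old_manifest["nodes"]
    new_nodes = new_manifest["nodes"]

    # A node counts as modified if it is new or its checksum changed.
    changed = {
        node_id
        for node_id, node in new_nodes.items()
        if node_id not in old_nodes
        or old_nodes[node_id]["checksum"]["checksum"] != node["checksum"]["checksum"]
    }

    # Breadth-first walk over child_map pulls in downstream nodes (the "+").
    selected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for child in new_manifest["child_map"].get(current, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


def _node(checksum: str) -> dict:
    # Hypothetical helper that builds a minimal manifest node.
    return {"checksum": {"name": "sha256", "checksum": checksum}}


# Toy manifests: stg_orders was edited, stg_customers was not.
OLD = {
    "nodes": {
        "model.shop.stg_orders": _node("aaa"),
        "model.shop.stg_customers": _node("bbb"),
        "model.shop.fct_orders": _node("ccc"),
    },
    "child_map": {},
}
NEW = {
    "nodes": {
        "model.shop.stg_orders": _node("aaa-edited"),
        "model.shop.stg_customers": _node("bbb"),
        "model.shop.fct_orders": _node("ccc"),
    },
    "child_map": {
        "model.shop.stg_orders": ["model.shop.fct_orders"],
        "model.shop.stg_customers": [],
        "model.shop.fct_orders": [],
    },
}

if __name__ == "__main__":
    print(sorted(modified_plus(OLD, NEW)))
```

In this toy case only `stg_orders` and its child `fct_orders` are selected, which is why a PR that touches one staging model does not rebuild the whole project.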

Zero-downtime deploys

Atomic view swap inside BEGIN/COMMIT. Production reads stay pointed at BLUE while GREEN builds, then re-point in one transaction.
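As an illustration of the re-point step, the sketch below generates the swap statements for a list of models. The function name, the `PROD` default, and the identifier check are assumptions for this example, not the course's BlueGreenDeployer. One caveat worth verifying against Snowflake's transaction documentation: Snowflake executes each DDL statement as its own transaction, so each `CREATE OR REPLACE VIEW` is atomic by itself, and the BEGIN/COMMIT wrapper may not make a multi-view swap all-or-nothing.

```python
import re

# Plain unquoted identifiers only, so model names cannot inject SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_swap_statements(
    models: list[str], target_schema: str, prod_schema: str = "PROD"
) -> list[str]:
    """Build the statements that re-point production views at target_schema."""
    if target_schema not in ("BLUE", "GREEN"):
        raise ValueError(f"unknown build schema: {target_schema}")
    for model in models:
        if not _IDENTIFIER.match(model):
            raise ValueError(f"unsafe model name: {model!r}")

    statements = ["BEGIN;"]
    for model in models:
        statements.append(
            f"CREATE OR REPLACE VIEW {prod_schema}.{model} "
            f"AS SELECT * FROM {target_schema}.{model};"
        )
    statements.append("COMMIT;")
    return statements


if __name__ == "__main__":
    for statement in build_swap_statements(["fct_orders", "dim_customers"], "GREEN"):
        print(statement)
```

Generating the statements separately from executing them keeps the swap easy to review and to log alongside the deploy record.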

Rollback you can trust

Time-travel via Snowflake AT/BEFORE plus a re-swap of production views — sub-minute recovery, audited in DEPLOY_STATE.
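A minimal sketch of how the point-in-time restore statement could be assembled, assuming the deploy start time is available as a timezone-aware timestamp (for example, read from the DEPLOY_STATE record). The function name is hypothetical. The timestamp has to fall inside the table's Time Travel retention period for the query to succeed.

```python
from datetime import datetime, timezone


def time_travel_restore_sql(table: str, deploy_started_at: datetime) -> str:
    """Build a statement that rewrites `table` as it was at deploy start."""
    if deploy_started_at.tzinfo is None:
        raise ValueError("deploy_started_at must be timezone-aware")

    # Normalise to UTC so the literal is unambiguous regardless of session settings.
    ts = deploy_started_at.astimezone(timezone.utc).isoformat()
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * FROM {table} "
        f"AT (TIMESTAMP => '{ts}'::TIMESTAMP_TZ);"
    )


if __name__ == "__main__":
    started = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
    print(time_travel_restore_sql("PROD.fct_orders", started))
```

The view re-swap is the faster path when only the pointer is wrong; a restore like this is for cases where the underlying table data itself needs to go back.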

Curriculum · 4 modules · 20-26 hours

Module 01 is free. The rest unlocks with PRO.

Try Module 01 — set up the Git foundation, configure dev/staging/prod profiles, install pre-commit, validate environment parity. About 4 hours. If it clicks, upgrade to unlock the CI, blue/green, and Terraform modules.

P12 · 20-26 hours · 4 modules

Free preview · PRO required

Module 01 is free — no card required. Lay the Git + environment foundation before paying.

M01 ✓ Foundation: Git workflow + dev/staging/prod environments

Trunk-based Git for data teams. Data-aware .gitignore + .gitattributes (Git LFS for seed CSVs). CODEOWNERS + PR template + branch protection. Pre-commit hooks (SQLFluff, dbt compile, secret scan). dbt profiles.yml with isolated dev/staging/prod schemas. validate_environments.py for parity checks.
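The parity check can be pictured roughly as follows. This is not the course's validate_environments.py; it is a small sketch that assumes the `outputs` mapping from a dbt profile has already been loaded into a dict, and that "parity" means every target defines the same settings while writing to a distinct database and schema.

```python
REQUIRED_TARGETS = ("dev", "staging", "prod")


def check_parity(outputs: dict[str, dict]) -> list[str]:
    """Return a list of human-readable parity problems (empty means OK)."""
    problems = []
    for target in REQUIRED_TARGETS:
        if target not in outputs:
            problems.append(f"missing target: {target}")
    present = {t: outputs[t] for t in REQUIRED_TARGETS if t in outputs}

    # Parity: every target should define the same set of settings.
    all_keys = set().union(*(cfg.keys() for cfg in present.values()))
    for target, cfg in present.items():
        for key in sorted(all_keys - set(cfg)):
            problems.append(f"{target}: missing key '{key}'")

    # Isolation: no two targets may write to the same database.schema.
    seen: dict[tuple, str] = {}
    for target, cfg in present.items():
        location = (cfg.get("database"), cfg.get("schema"))
        if location in seen:
            problems.append(
                f"{target} shares {location[0]}.{location[1]} with {seen[location]}"
            )
        else:
            seen[location] = target
    return problems


if __name__ == "__main__":
    # Hypothetical profile values, for illustration only.
    outputs = {
        "dev": {"type": "snowflake", "database": "ANALYTICS", "schema": "DEV"},
        "staging": {"type": "snowflake", "database": "ANALYTICS", "schema": "STAGING"},
        "prod": {"type": "snowflake", "database": "ANALYTICS", "schema": "PROD"},
    }
    print(check_parity(outputs) or "environments are in parity")
```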

4-5h · 6 lessons · FREE PREVIEW

M02 ⊘ Automation: dbt CI on GitHub Actions + quality gates

dbt test suite (generic + custom + singular tests with reconciliation). GitHub Actions dbt-ci.yml running dbt build on PR. PR-scoped CI schemas + cleanup workflow on PR close. Slim CI with state:modified+ and manifest caching. Test-result reporting back to PR comments.
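One way to picture PR-scoped schemas: derive the schema name from the pull request number when CI starts, and emit the matching cleanup statement when the PR closes. The `CI_PR_N` naming follows the pattern this page uses elsewhere; the function names are hypothetical.

```python
def ci_schema_name(pr_number: int) -> str:
    """Schema that isolates one pull request's dbt build."""
    if not isinstance(pr_number, int) or pr_number <= 0:
        raise ValueError("pr_number must be a positive integer")
    return f"CI_PR_{pr_number}"


def cleanup_statement(pr_number: int) -> str:
    """Statement for the cleanup workflow that runs when the PR is closed."""
    return f"DROP SCHEMA IF EXISTS {ci_schema_name(pr_number)} CASCADE;"


if __name__ == "__main__":
    print(ci_schema_name(412))
    print(cleanup_statement(412))
```

Because the name is a pure function of the PR number, the cleanup job needs no lookup table to find the schema it has to drop.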

5-6h · 8 lessons · PRO TIER

M03 ⊘ Deployment: blue/green view swaps + time-travel rollback

BLUE/GREEN schema pattern + DEPLOY_STATE audit table. BlueGreenDeployer Python class. Pre-cutover validation (row count, null PKs, freshness, schema compatibility). Atomic view re-point inside BEGIN/COMMIT. Snowflake time-travel rollback (AT/BEFORE) and the orchestrator that ties build → validate → swap → notify together.
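The pre-cutover gate might look something like the sketch below, which covers three of the listed checks (row count, null primary keys, schema compatibility) and leaves freshness out for brevity. The `TableStats` shape and the 20% row-drop threshold are assumptions made for the example; in practice the stats would come from queries against the BLUE and GREEN schemas.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    row_count: int
    null_pk_count: int
    columns: frozenset[str]


def pre_cutover_failures(
    blue: TableStats, green: TableStats, max_row_drop: float = 0.20
) -> list[str]:
    """Return the reasons GREEN must not be swapped in (empty means safe)."""
    failures = []

    # Row count: GREEN should not shrink sharply relative to the live build.
    if blue.row_count > 0:
        drop = (blue.row_count - green.row_count) / blue.row_count
        if drop > max_row_drop:
            failures.append(f"row count dropped {drop:.0%}")

    # Null primary keys break joins and uniqueness downstream.
    if green.null_pk_count > 0:
        failures.append(f"{green.null_pk_count} null primary keys")

    # Schema compatibility: every column consumers read today must still exist.
    missing = sorted(blue.columns - green.columns)
    if missing:
        failures.append("missing columns: " + ", ".join(missing))

    return failures


if __name__ == "__main__":
    blue = TableStats(1000, 0, frozenset({"id", "amount"}))
    green = TableStats(500, 3, frozenset({"id"}))
    for failure in pre_cutover_failures(blue, green):
        print("BLOCKED:", failure)
```

Adding a column passes the compatibility check while dropping one fails it, which matches the usual rule that a swap must not break existing readers.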

5-7h · 8 lessons · PRO TIER

M04 ⊘ Production: Terraform IaC + SLA monitoring + SEV runbooks

Terraform Snowflake provider — warehouses with auto-suspend tiering, role hierarchy (LOADER / TRANSFORMER / REPORTER), grants per environment. Multi-stage Dockerfile + docker-compose for local parity. Freshness SLA checks + row-count anomaly macro. SEV runbook templates and the deployment audit log model.
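For the monitoring side, the two checks reduce to simple comparisons. The sketch below assumes the thresholds shown in the pipeline overview (a 24-hour freshness window and a ±20% row-count band); the function names and the idea of comparing against a single baseline count are illustrative choices, not the course's macro.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(
    last_loaded_at: datetime,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
) -> bool:
    """True if the most recent load is within the freshness SLA."""
    return now - last_loaded_at <= max_age


def row_count_anomaly(today: int, baseline: int, tolerance: float = 0.20) -> bool:
    """True if today's row count deviates from the baseline by more than tolerance."""
    if baseline == 0:
        # Any rows at all are a deviation from an empty baseline.
        return today != 0
    return abs(today - baseline) / baseline > tolerance


if __name__ == "__main__":
    now = datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)
    loaded = datetime(2024, 5, 1, 6, 0, tzinfo=timezone.utc)
    print("fresh:", is_fresh(loaded, now))
    print("anomaly:", row_count_anomaly(today=1300, baseline=1000))
```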

5-6h · 8 lessons · PRO TIER

3 modules locked · Unlock all PRO content for $29/mo


Backed by curriculum

DataOps: CI/CD & Infrastructure as Code

8 modules · 14 hours · Git for data · PR-scoped CI · blue/green · rollback · Terraform

This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.

The build, in 3 phases

Three sprints. Three checkpoints. One production-grade pipeline.

Each phase ends with a tagged commit and a runnable artifact. No ambiguity about where you are.

01 · ~9h

Foundation + automation

Trunk-based Git with pre-commit + branch protection (Module 01). dbt CI on GitHub Actions with PR-scoped schemas + Slim CI cutting full-build time to a selective <5-min run (Module 02).

✓ Pre-commit + CODEOWNERS + dev/staging/prod profiles
✓ dbt CI workflow with PR-scoped schemas
✓ Slim CI on state:modified+ with manifest cache

02 · ~6h

Blue/green deploy + rollback

BLUE/GREEN schema pattern with atomic view swap inside BEGIN/COMMIT. Pre-cutover checks. Time-travel rollback + DEPLOY_STATE audit. Orchestrator chains the whole pipeline.
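The chaining idea can be sketched in a few lines: run the stages in order, stop at the first failure, and keep a record of what happened. This is not the course's orchestrator.py; the step callables are stand-ins, and the returned list plays the role that the DEPLOY_STATE audit table plays in the real pipeline.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_deploy(steps: list[Step]) -> list[tuple[str, str]]:
    """Run steps in order; stop at the first failure and return an audit trail."""
    audit = []
    for name, step in steps:
        ok = step()
        audit.append((name, "ok" if ok else "failed"))
        if not ok:
            # Never swap (or notify success) after a failed build or validation.
            break
    return audit


if __name__ == "__main__":
    steps = [
        ("build", lambda: True),
        ("validate", lambda: False),  # simulate a failed pre-cutover check
        ("swap", lambda: True),
        ("notify", lambda: True),
    ]
    for name, status in run_deploy(steps):
        print(f"{name}: {status}")
```

With the simulated validation failure, the swap never runs, so production keeps reading from the currently active schema.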

✓ BlueGreenDeployer + DEPLOY_STATE table
✓ Pre-cutover validation framework
✓ Time-travel rollback + orchestrator.py

03 · ~6h

Terraform + SLA hardening

Snowflake-as-code via Terraform: warehouses, RBAC role hierarchy, grants per env. Multi-stage Docker image. Freshness SLA + row-count anomaly checks. SEV runbooks + deployment audit model.

✓ terraform/ for warehouses + roles + grants
✓ Multi-stage Dockerfile + docker-compose
✓ Freshness SLA + SEV runbooks + audit model

Project setup · 10 minutes

One repo. dbt + Snowflake + GitHub Actions + Terraform.

The starter kit ships the full PipelineOps Inc repo — dbt project skeleton, GitHub Actions workflows, blue/green Python orchestrator, Terraform modules, sample CSVs with intentional data-quality issues, and the seed schema for Snowflake.

What lives in the repo

Everything you need to build the platform under the platform — dbt, CI workflows, deploy scripts, IaC, and seed data with deliberate quality bugs so your CI gates catch real failures.

models/ + tests/ — staging/intermediate/marts dbt project with generic, custom, and singular tests
.github/workflows/ — dbt-ci.yml, dbt-slim-ci.yml, env-parity-check.yml, schema cleanup, manifest cache
deploy/ — blue_green.py, pre_cutover_checks.py, rollback.py, backfill.py, orchestrator.py
terraform/ — warehouses.tf, roles.tf, grants per dev/staging/prod with remote state backend
Dockerfile + docker-compose.yml — multi-stage build (Python 3.11 slim) + local-parity stack
runbooks/ + scripts/ — SEV templates + check_freshness.py + chaos_test.py skeletons

Download · Starter Kit

CI/CD Data Platform Starter Kit

Pre-built repo with dbt project skeleton, GitHub Actions workflows, blue/green orchestrator, Terraform modules, SEV runbook templates, and 14,800 rows of seed data with intentional QA issues.

Pro project · ~250 KB · 73 files · 4 sample CSVs (14.8k rows total)

~/projects/cicd-data-platform — zsh

1. Clone and install pre-commit hooks

$ git clone https://github.com/ai-de/p12-cicd-data-platform

$ cd p12-cicd-data-platform && pre-commit install

2. Configure dbt profiles + verify environment parity

$ cp profiles.yml.example ~/.dbt/profiles.yml

$ python scripts/validate_environments.py --target dev

3. Run the dbt CI suite locally (matches what GitHub Actions runs)

$ dbt seed --target ci

$ dbt build --target ci --select state:modified+ --state <dir-with-saved-manifest>

4. Provision Snowflake objects via Terraform

$ cd terraform && terraform init

$ terraform plan -var-file=dev.tfvars

$ terraform apply -var-file=dev.tfvars

5. Run a blue/green deploy + simulate rollback

$ python deploy/orchestrator.py --target green --git-sha $(git rev-parse HEAD)

$ python deploy/rollback.py --deploy-id <id> --strategy view-swap

Seed data: 1.5k customers · 300 products · orders · events

What changes vs a hand-deployed dbt project

The same dbt project — but with the safety net a 15-engineer team needs.

Most teams ship dbt by hand: dbt run --target prod from a laptop, no review, no rollback. The patterns in this project (state:modified+, atomic view swaps, time travel, Terraform-managed RBAC) are what unlock safe parallelism without breaking the warehouse.

Hand-deployed version (what most teams have today)

Deploys: dbt run --target prod from a laptop
Tests: Run locally, sometimes, when remembered
PR validation: Manual SQL review, no CI
Rollback: git revert + redeploy + cross fingers
Schema isolation: Single PROD schema for everything
Snowflake config: Click-Ops in the UI; drift everywhere
Incident response: Slack message; no runbook, no SLO

Your CI/CD platform version (Module 02–04)

✓ Deploys: GitHub Actions runs dbt build --target prod on merge, audit-logged
✓ Tests: Generic + custom + singular tests run on every PR, blocking merges
✓ PR validation: dbt build in PR-scoped CI_PR_N schema, cleaned on close
✓ Rollback: View re-swap + AT/BEFORE time travel; sub-minute, recorded in DEPLOY_STATE
✓ Schema isolation: BLUE/GREEN schemas; production reads see only the active one
✓ Snowflake config: Terraform modules for warehouses, roles, grants; plan/apply per env
✓ Incident response: SEV1/SEV2 runbooks + freshness SLA checks + deployment audit log

PRO benefit · code review

Real review from senior engineers who shipped this stack.

Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.

4 reviews / month

Submit a repo, a PR, or a Terraform plan. Reviewer is matched to your domain — dbt + GitHub Actions + Snowflake for this project. Async, comments inline, average turnaround 31 hours.

31h avg turnaround · 9.2/10 helpfulness · 94% return next month

2 office hours / month

Live 30-min sessions with a senior platform engineer. Architecture questions, walk a tricky deploy migration, mock a system-design interview on CI/CD for data. Group sessions also available.

30 min per session · 2 / mo included · + group unlimited

What PRO unlocks

One subscription. 15+ projects, all curriculum, code review.

PRO is built for engineers who want production-grade builds and feedback loops — not more tutorials.

What you get (FREE / PRO / EXPERT)

Projects (production-grade builds): PRO 15+
Curriculum modules (all 7 tracks): FREE Phase 1 only · PRO All · EXPERT All + bonus
Code review credits (senior engineer review): PRO 4 / month · EXPERT Unlimited
Career path access (5 paths × full plans): FREE 1 path · PRO All 5 · EXPERT All 5 + 1:1
Certificate (verifiable on LinkedIn): FREE not included · PRO Yes · EXPERT Yes + portfolio review
Community (Discord + office hours): FREE Read-only · PRO Full + 2/mo · EXPERT Full + 4/mo

$29/mo, billed monthly · cancel anytime
or annual: $249/yr (save 28%)

Who this is for

Pick this if your dbt project is load-bearing for someone else’s revenue.

Data engineers going senior+

You ship dbt every day but deploys are still hand-rolled. This is the project that turns 'I can write models' into 'I can run a 15-engineer dbt team safely.'

Analytics engineers ready to own delivery

You can build the marts. Now you want to own the pipeline that ships them — PRs, CI, blue/green — without waiting for a platform team to build it for you.

Platform engineers entering data

You've shipped CI/CD for app code. Data is a new shape — atomic view swaps instead of containers, time travel instead of revert. This translates the patterns.

Eng managers and tech leads

You're the one writing the SLA and the SEV runbook. This gives you the reference architecture and the language to argue for it in your roadmap reviews.

Related curriculum

Going deeper? Three tracks back this project.

CI/CD is the spine of this project. These three curricula let you go deeper on the layers that matter most: the dbt models you're shipping, the freshness checks you're alerting on, and the governance posture that proves the platform is safe.

dbt & Analytics Engineering
Data Observability & Quality
Governance & Data Contracts

FAQ

Quick answers.

Does this include Kubernetes, Helm, or a full Airflow deployment?

No — and we used to advertise that, which was misleading. The deploy substrate in this project is GitHub Actions + Docker + Terraform-managed Snowflake. There's an Airflow DAG-versioning pattern shown in Module 03, but no scheduler/worker setup. Kubernetes and Helm aren't in scope; they get a dedicated platform project later in the catalog.

Is the '99.9% SLA' actually achieved in this project?

No — it's the SLO target the runbooks are designed around. Module 04 walks you through the freshness checks, row-count anomalies, and SEV escalation tiers that a 99.9%-targeted platform needs. The project doesn't run a 30-day production load to measure uptime — that's not a tutorial-scale exercise.
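For a sense of scale, the arithmetic behind a 99.9% target is short: over a 30-day window it allows roughly 43 minutes of unavailability. The helper below is only a worked calculation; the function name is hypothetical.

```python
def downtime_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime for an availability SLO over a window."""
    if not 0 < slo <= 1:
        raise ValueError("slo must be a fraction in (0, 1]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo)


if __name__ == "__main__":
    budget = downtime_budget_minutes(0.999)
    print(f"99.9% over 30 days allows about {budget:.1f} minutes of downtime")
```

That budget is why a SEV1 acknowledgement target measured in minutes matters: a single slow response can consume a meaningful share of the month's allowance.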

How is this different from /projects/dataguard-observability?

DataGuard goes deep on the observability layer (freshness, anomaly, lineage, alerting). This project goes deep on the deploy spine (PRs, CI, blue/green, rollback, IaC). They sit next to each other — your CI/CD platform is what catches breaking changes before they ship; DataGuard is what catches breakages after they ship. Both are in the platform track and most senior+ engineers do both.

Do I need a Snowflake account?

Helpful but not required. Module 01 walks setup (free Snowflake trial works), and the dbt + Python deploy code is testable locally with the Docker Compose stack. You won't get the full blue/green time-travel experience without a Snowflake account, but you can build and test everything else.

Are the GitHub Actions workflows real or just templates?

Real. Module 02 ships dbt-ci.yml, dbt-slim-ci.yml, an env-parity-check workflow, a PR-scoped schema cleanup workflow, and a manifest-cache workflow — all runnable in your fork. They're tutorial-scoped (small repo, single Snowflake account) but the patterns transfer directly to a 150-model production repo.

What does PRO actually unlock for $29/mo?

All 15+ PRO projects, 4 code-review credits per month, 2 office-hours sessions, full curriculum across all 7 tracks, all 5 career paths, certificate of completion, and full community access. Cancel anytime.

Ready to build the platform under the platform?

Start with Module 01 — free, no card. About 4 hours. By the end you'll have trunk-based Git, pre-commit hooks, dev/staging/prod dbt profiles, and a parity-validated environment foundation. If it clicks, upgrade to unlock CI, blue/green, and Terraform modules.
