Path 02 · +3x YoY hiring

Feed production LLMs and ML models.

Vector retrieval, feature stores, LLM batch enrichment, evaluation pipelines. The infra that makes AI actually work in production.

See the path

After this path

What you'll actually be able to do.

Not "you'll know about Airflow." What you'll ship, debug, and defend in an interview.

You’ll be the person who

Builds retrieval that actually returns the right thing
Designs feature stores with point-in-time correctness
Runs LLM batch enrichment with cost + retry control
Evaluates retrieval with recall@k, not vibes
Bridges the data team and the ML team

And the market pays you for

Build retrieval · Not vibes-based

Design feature stores · Point-in-time correct

Run LLM pipelines · With cost + retries

Evaluate rigorously · Recall@k, not vibes

System architecture

The system you'll build by the end.

A production reference architecture — not a toy demo. Every node maps to a course or project in this path.

01 · Source

Postgres

Events

Docs

02 · Embed

OpenAI

bge-m3

Cohere

03 · Store

pgvector

Qdrant

Feast

04 · Retrieve

Hybrid BM25+ANN

Rerank

05 · Serve

LLM API

Eval loop

Feedback

Orchestration: Ray + Airflow

Every node → a course + project

Your path

From week one to capstone.

A realistic 5-stage timeline. Go faster if you already have pieces; slower if you're brand new.

01 · Week 1–2
DE basics (fast)
SQL, Airflow, containers — skip if you have it
Study
sql-mastery python-de airflow
02 · Week 3–6
Batch + streaming
The pipelines that feed ML
Study
dbt spark kafka-streams
Ship
ecommerce-data-warehouse
03 · Week 7–12
AI & Vectors
Embeddings, hybrid retrieval, evaluation
Study
vector-databases rag llm-evaluation dataset-engineering
Ship
ai-retrieval-platform llm-evaluation-framework
04 · Week 13–17
Feature stores
Feast, offline/online parity, drift
Study
feature-stores mlops llm-pipeline
Ship
predictflow-feature-store llm-ingestion-pipeline
05 · Week 18–20
Ship capstone
Vector search + LLM enrichment on real data
Ship
enterprise-rag

Capstone project

One project, endlessly talkable.

Every path ends with a flagship capstone you'll ship, write up, and walk through in every interview loop.

P06 · Capstone · Enterprise RAG platform

The capstone that gets you the AI-era role.

pgvector · Qdrant · FastAPI · Redis · Ray · OpenAI

Plus projects

predictflow-feature-store llm-evaluation-framework llm-ingestion-pipeline

What you’ll ship

01 · Index 5M docs with bge-m3 embeddings
02 · Hybrid BM25 + ANN retrieval
03 · Rerank top-100 → top-10 with a cross-encoder
04 · Recall@k eval on a human-labeled set
05 · API served with FastAPI behind a Redis cache
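
A taste of step 04: recall@k asks what fraction of the documents a human marked relevant show up in the top k results. The sketch below is a minimal plain-Python version. The function names and the toy document IDs are illustrative assumptions, not code from the capstone.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int
) -> float:
    """Average recall@k over (ranked results, labeled relevant set) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    if not scores:
        raise ValueError("need at least one query to evaluate")
    return sum(scores) / len(scores)


# Toy example: one query, two labeled-relevant docs, one of them in the top 3.
ranked = ["d3", "d1", "d9", "d2"]
labeled = {"d1", "d2"}
score = recall_at_k(ranked, labeled, k=3)
```

Tracking the mean of this number across a fixed labeled query set, before and after each retrieval change, is one simple way to tell whether a change actually helped.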

Proof

Questions you'll confidently answer.

These are real interview questions for AI Data Engineer roles. If you can answer all four with a story from your capstone, you're ready.

How would you evaluate whether retrieval is getting better?

Why not just use cosine similarity? When does BM25 still win?

Design a feature store with point-in-time correctness

How do you batch-enrich 10M rows with an LLM on a budget?
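
To make the point-in-time question concrete: for each training row, a feature lookup should return the latest value recorded at or before that row's timestamp, never a later one. Here is a minimal sketch of that rule using only the standard library. The function name and the sample history are hypothetical, and a real feature store such as Feast does this as a join across whole tables rather than one lookup at a time.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional, Tuple


def point_in_time_value(
    history: List[Tuple[datetime, float]], as_of: datetime
) -> Optional[float]:
    """Return the latest feature value with timestamp <= as_of, or None.

    `history` must be sorted ascending by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    # bisect_right counts the entries with timestamp <= as_of.
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx else None


# Hypothetical feature history for one entity.
history = [
    (datetime(2024, 1, 1), 10.0),
    (datetime(2024, 1, 5), 20.0),
    (datetime(2024, 1, 9), 30.0),
]

# A label observed on Jan 6 may only see the Jan 5 value, not the Jan 9 one.
value = point_in_time_value(history, datetime(2024, 1, 6))
```

Returning None when no value existed yet is a deliberate choice in this sketch: it keeps future data from leaking into training rows.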

Why this matters: Most courses let you hide behind passive video-watching. ai-de projects force you into the exact failure modes interviewers probe for — so when you sit in the interview, you’ve already lived the answer.

Skills · syllabus

Stack you'll learn.

Not memorized — operated. Each tool is taught inside a project, not an isolated lecture.

Embeddings · pgvector · Qdrant · Feast · Ray · LLM APIs · Retrieval eval · Spark · FastAPI

Not quite the right fit? Explore the other 4 paths

The core path · Core Data Engineer
Senior / staff track · Data Platform Engineer
Closest to the business · Analytics / Product Data Engineer
Senior+ AI infra · AI Platform Engineer

Your move

Start building your first system — today.

Module 01 is free. No card. Ship something real this weekend.

Compare all 5 paths

AI Data Engineer · free module 01 · no card
