Engineering Insights
Hard-won lessons, system design teardowns, and architecture guides from the frontlines of data engineering.
Why We Migrated from Airflow to Kubernetes-Native Orchestration
After three years running Airflow at scale, we hit the ceiling: resource contention, slow DAG parsing, and a scheduler that became a single point of failure. Here's the full story of how we rebuilt orchestration on Argo Workflows — what we gained, what we lost, and the lessons you can steal.
5 articles
The Reality of Streaming: When to Actually Use Apache Flink
Flink is extraordinarily powerful and extraordinarily complex. Most teams reach for it before they need it — and pay the operational price. Here's a framework for deciding when stream processing is justified, and when a micro-batch approach will serve you just as well.
Implementing Data Contracts in a dbt Monorepo
Data contracts promise to fix the silent breakage problem — upstream schema changes that quietly corrupt downstream reports. But the tooling is still maturing. Here's what actually worked for us: a lightweight contract layer built on dbt meta, JSON Schema, and a pre-merge CI check.
Building a Cost-Efficient RAG Pipeline with Pinecone
RAG pipelines can get expensive fast: embedding costs, vector storage costs, LLM inference costs. After running our internal knowledge-base RAG system in production for six months, we cut costs by 70% without sacrificing retrieval quality. Here's what we optimized.
Stop Building Toy Pipelines: The 2026 Data Engineering Portfolio Guide (with Code)
Hiring managers see hundreds of GitHub repos with a Jupyter notebook and a README promising an "end-to-end pipeline." They pass on all of them. Here's how to build a PySpark + dbt + Airflow portfolio project that demonstrates production-grade thinking — with full code.
Snowflake vs BigQuery in 2026: A Cost Analysis
We ran the same workload — 8 TB scanned daily, mixed ad-hoc and scheduled queries, three BI tools — on both Snowflake and BigQuery for 30 days. The winner depends heavily on your query patterns. Here are the numbers.