Build a
production-grade
fraud detection pipeline on Apache Flink
Run Flink + Kafka locally with Docker. Ship 5 stateful detectors over event-time windows, exactly-once writes into Kafka via two-phase commit, and a Flink Kubernetes Operator deployment with ZooKeeper HA and RocksDB incremental checkpoints — no managed streaming service required.
This is the system-design question asked at Stripe, Block, Adyen, and Plaid — and any team running real-time decisions on streaming events.
- A working Flink + Kafka stack running locally (Docker Compose with Zookeeper, Kafka, JM, TM, Postgres)
- 5 stateful fraud detectors: HighValue, Velocity (tumbling), SpendingVelocity (sliding), CustomerProfile (Z-score), RapidEscalation (timer-driven)
- Exactly-once Kafka sink via DeliveryGuarantee.EXACTLY_ONCE + transactional ID prefix
- Elasticsearch bulk sink with daily indices and idempotent doc IDs
- ProcessFunctionTestHarness unit-test scaffolding
- Flink Kubernetes Operator FlinkDeployment CRD with ZK 3-node HA + RocksDB incremental → S3
Streaming is the default for fraud, risk, and routing.
Batch nightly scoring is no longer acceptable for transactional risk. The patterns you ship in this project — keyed state, event-time windows, two-phase commit — are the ones in production at the companies setting the bar for real-time decisioning.
Exactly-once is table-stakes
Duplicate alerts cost real money — chargebacks, false declines, eroded trust. Flink’s 2PC + transactional Kafka sinks are the reference implementation; this project walks the barrier protocol end-to-end.
Keyed state replaces nightly batch
Velocity, escalation, and per-customer anomaly used to be Spark jobs at midnight. Keyed ValueState + timers move them inline — sub-window decisioning, not next-day reports.
K8s Operator pattern
Declarative streaming infra. The Flink Kubernetes Operator turns a streaming job into a CRD; savepoints become git-managed artifacts; failover is declarative.
Interview signal
Senior+ data and backend roles at fintech expect you to whiteboard watermarks, exactly-once delivery, and stateful processing. The right vocabulary, with code to back it.
Module 01 is free. The rest unlocks with PRO.
Try the first 3-4 hours — stand up the local Flink + Kafka stack, write your first DataStream pipeline, ship a unit-tested ProcessFunction with side outputs. If it clicks, upgrade to unlock stateful detection, exactly-once connectors, and the K8s deployment modules.
Apache Flink & Stream Processing
This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.
Three sprints. Three checkpoints. One production fraud pipeline.
Each phase ends with a tagged commit and a working artifact. No ambiguity about where you are.
Flink + Kafka running locally via Docker. First DataStream pipeline reading transactions and emitting to stdout. One stateless detector covered by a unit test.
- ✓Docker Compose stack (Zookeeper, Kafka, Flink JM/TM, Postgres)
- ✓TransactionConsumerJob + Transaction POJO + KafkaSource
- ✓HighValueDetector ProcessFunction + ProcessFunctionTestHarness test
Watermarks, tumbling and sliding windows, KeyedProcessFunction with timers, ValueState Z-score profiles, late-data side outputs — 4 stateful detectors wired and explained.
- ✓VelocityDetector + SpendingVelocityDetector (tumbling + sliding)
- ✓CustomerProfileFunction (ValueState Z-score)
- ✓RapidEscalationDetector + LateDataHandler (timers + sideOutputLateData)
Transactional Kafka sink with EOS, Elasticsearch bulk sink, Flink K8s Operator deployment with ZK HA and RocksDB → S3. The job that runs in module 01 — but deployed the way fintech runs streaming.
- ✓Kafka EOS sink + Elasticsearch bulk sink + schema-evolution deserializer
- ✓FlinkDeployment CRD + ZK 3-node HA + incremental RocksDB → S3
- ✓Prometheus reporter + savepoint-driven upgrade workflow
One command. Local Flink + Kafka + Postgres stack.
You get a real stack on day one — Kafka with Zookeeper, Flink JM/TM, Postgres seed, and 5,000 synthetic transactions ready to replay (with seeded fraud patterns).
What lives in the repo
Everything you need to stand up a production-shaped streaming stack on your laptop, plus the seed data and starter implementations referenced in modules 01–04.
- docker-compose.yml — Zookeeper, Kafka, Flink JM/TM, Postgres
- src/main/java/.../jobs/ — TransactionConsumerJob + FraudDetectionJob skeletons
- src/main/java/.../detectors/ — HighValue, Velocity, SpendingVelocity, CustomerProfile, RapidEscalation
- src/main/java/.../sinks/ — AlertKafkaSink + ElasticsearchAlertSink
- src/test/java/ — ProcessFunctionTestHarness scaffolding
- k8s/ — FlinkDeployment CRD + ZK StatefulSet reference manifests
- fixtures/ — transactions.jsonl (5,000 events), customers.csv (500 rows), rules.json (20 broadcast rules), expected_alerts.jsonl (298 alerts)
Flink Fraud Detection Starter Kit
Maven Java 17 project (Flink 1.20), Docker Compose stack, replay producer, pre-built JSONL fixtures, and reference K8s manifests. Skip the boilerplate, start on module 01.
The same pipeline — but built for the 10x case.
Most Flink tutorials show you the DataStream. This one shows what changes when checkpoints fail mid-window, the alert topic is consumed by three teams, and the rules need to update without a redeploy.
s3://... with state.backend.incremental: trueKafkaSink with EXACTLY_ONCE + transactional ID prefixread_committed consumersEmbeddedRocksDBStateBackend with managed memory + tuned write bufferdlq Kafka topic + alert metrickubectl annotate savepoint trigger — atomic, replayablerules Kafka topic — hot reload without redeployReal review from senior engineers who shipped this stack.
Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.
4 reviews / month
Submit a repo, a PR, or a refactor proposal. Reviewer is matched to your domain — Flink/Kafka for this project. Async, comments inline, average turnaround 31 hours.
2 office hours / month
Live 30-min sessions with a senior streaming engineer. Architecture questions, whiteboard a tricky watermark, mock a system-design interview. Group sessions also available.
One subscription. 15+ projects, all curriculum, code review.
PRO is built for senior+ engineers who want production-grade builds and feedback loops — not more tutorials.
Pick this if you’re shipping at scale, not learning to.
Data engineers moving from batch to streaming
You've shipped dbt + Airflow pipelines. You want to know what changes when the data is unbounded — watermarks, state, exactly-once, recovery.
Backend engineers interviewing at fintech
Stripe, Block, Adyen, and Plaid expect you to whiteboard streaming fraud. After this you can talk through 2PC, keyed state, and operator deployment without hand-waving.
Platform engineers scoping streaming infra
You're sizing a real-time alerting workload for the org. You need to understand the failure modes, the K8s Operator pattern, and what to put behind a service before signing off.
Staff / tech leads
You're driving the move off nightly batch risk scoring. You need to understand watermarks, late data, exactly-once trade-offs, and the operational footprint before the rest of the team commits.
Going deeper? Three tracks back this project.
Flink fundamentals are the spine. These three curriculums let you go deeper on the layers that matter most — Kafka internals, streaming primitives, and event schema design.
Quick answers.
Ready to ship a real streaming pipeline?
Start with module 01 — free, no card. About 3-4 hours. By the end you'll have Flink + Kafka running locally and a unit-tested HighValue detector emitting through a side output channel.