Skip to content
ai-de.net/Projects/P01 · Flink Fraud Detection
PRO · module 01 free previewStreaming trackP01

Build a
production-grade
fraud detection pipeline on Apache Flink

Run Flink + Kafka locally with Docker. Ship 5 stateful detectors over event-time windows, exactly-once writes into Kafka via two-phase commit, and a Flink Kubernetes Operator deployment with ZooKeeper HA and RocksDB incremental checkpoints — no managed streaming service required.

Timeline
16-20 hours
Difficulty
Senior+
Stack
Flink · Kafka · RocksDB · Kubernetes

This is the system-design question asked at Stripe, Block, Adyen, and Plaid — and any team running real-time decisions on streaming events.

By the end you will have
  • A working Flink + Kafka stack running locally (Docker Compose with Zookeeper, Kafka, JM, TM, Postgres)
  • 5 stateful fraud detectors: HighValue, Velocity (tumbling), SpendingVelocity (sliding), CustomerProfile (Z-score), RapidEscalation (timer-driven)
  • Exactly-once Kafka sink via DeliveryGuarantee.EXACTLY_ONCE + transactional ID prefix
  • Elasticsearch bulk sink with daily indices and idempotent doc IDs
  • ProcessFunctionTestHarness unit-test scaffolding
  • Flink Kubernetes Operator FlinkDeployment CRD with ZK 3-node HA + RocksDB incremental → S3
PREREQBuilt for senior+ engineers. Comfortable with Java basics (Java 17 + Maven), Docker Compose, and Kafka concepts (topics, partitions, offsets). Not a streaming primer — assumes you’ve operated a producer/consumer in production before.
fraud-detection · transactions → fraud-alerts
exactly-once
Docker stack
Kafka topics
Flink operators
Sinks
kafka
zookeeper
flink-jm
flink-tm
postgres
docker compose up
transactions5,000 events · seeded fraud
customers500 profiles · postgres seed
rules20 rules · broadcast state
input backbone · read_committed
HighValueDetectorProcessFunction · M01
VelocityDetectortumbling 1 min
SpendingVelocityDetectorsliding 1h / 5m
CustomerProfileFunctionValueState · Z-score
RapidEscalationDetectortimers · cleanup
keyBy(customerId)
fraud-alertsKafka · EOS
fraud-alertsES · daily idx
dlqside output
idempotent IDs
downstream tee
# Exactly-once via 2PC
KafkaSink.builder()
.setDeliveryGuarantee(EXACTLY_ONCE)
barrier · precommit · commit on confirm
→ no duplicate fraud alerts on rebalance
● RocksDB → S3 incremental
state.backend: rocksdb
state.backend.incremental: true
savepoint-driven upgrades on K8s
→ Flink Operator + ZK HA in production
5
Stateful detectors
EOS
Via 2PC into Kafka
K8s
Operator-deployed
Why this matters in 2026

Streaming is the default for fraud, risk, and routing.

Batch nightly scoring is no longer acceptable for transactional risk. The patterns you ship in this project — keyed state, event-time windows, two-phase commit — are the ones in production at the companies setting the bar for real-time decisioning.

Exactly-once is table-stakes

Duplicate alerts cost real money — chargebacks, false declines, eroded trust. Flink’s 2PC + transactional Kafka sinks are the reference implementation; this project walks the barrier protocol end-to-end.

Keyed state replaces nightly batch

Velocity, escalation, and per-customer anomaly used to be Spark jobs at midnight. Keyed ValueState + timers move them inline — sub-window decisioning, not next-day reports.

K8s Operator pattern

Declarative streaming infra. The Flink Kubernetes Operator turns a streaming job into a CRD; savepoints become git-managed artifacts; failover is declarative.

Interview signal

Senior+ data and backend roles at fintech expect you to whiteboard watermarks, exactly-once delivery, and stateful processing. The right vocabulary, with code to back it.

Curriculum · 4 modules · 16-20 hours

Module 01 is free. The rest unlocks with PRO.

Try the first 3-4 hours — stand up the local Flink + Kafka stack, write your first DataStream pipeline, ship a unit-tested ProcessFunction with side outputs. If it clicks, upgrade to unlock stateful detection, exactly-once connectors, and the K8s deployment modules.

P01 · 16-20 hours · 4 modules
Free preview PRO required
Module 01 is free — no card required. Get a feel for the stack before paying.
M01
Foundation: Flink + Kafka + DataStream basics
Stand up Flink + Kafka with Docker Compose. Build your first DataStream pipeline with map / filter / keyBy. Implement a stateless HighValue detector as a ProcessFunction with side outputs. Cover it with a ProcessFunctionTestHarness unit test.
3-4h9 lessonsFREE PREVIEW
Start →
M02
Time, windows, state — 4 stateful detectors
Master event-time processing with bounded-lateness watermarks. Build Velocity (tumbling 1-min), SpendingVelocity (sliding 1h / 5-min slide), CustomerProfile (ValueState Z-score anomaly), and RapidEscalation (KeyedProcessFunction with timer-driven cleanup). Tune the RocksDB state backend.
5-6h12 lessonsPRO TIER
Unlock with PRO →
M03
Connectors: exactly-once Kafka + Elasticsearch
Wire a tuned Kafka source (fetch sizing, session timeout). Ship the alert sink with DeliveryGuarantee.EXACTLY_ONCE, transactional ID prefix, and read_committed consumers. Add an Elasticsearch bulk sink with daily indices and idempotent doc IDs. Walk the two-phase commit barrier protocol.
4-5h11 lessonsPRO TIER
Unlock with PRO →
M04
Kubernetes: Flink Operator, ZK HA, RocksDB → S3
Deploy via the Flink Kubernetes Operator (Application Mode + FlinkDeployment CRD). Wire ZooKeeper 3-node HA and incremental RocksDB checkpoints to S3. Configure the Prometheus reporter on port 9249. Drive savepoint-based upgrades via kubectl annotation.
4-5h11 lessonsPRO TIER
Unlock with PRO →
3 modules locked · Unlock all PRO content for $29/mo
Upgrade to PRO →
Backed by curriculum

Apache Flink & Stream Processing

9 modules·~14 hours·DataStream API·watermarks·keyed state·exactly-once·K8s Operator
Open curriculum

This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.

The build, in 3 phases

Three sprints. Three checkpoints. One production fraud pipeline.

Each phase ends with a tagged commit and a working artifact. No ambiguity about where you are.

01~4h
Stand up the streaming foundation

Flink + Kafka running locally via Docker. First DataStream pipeline reading transactions and emitting to stdout. One stateless detector covered by a unit test.

  • Docker Compose stack (Zookeeper, Kafka, Flink JM/TM, Postgres)
  • TransactionConsumerJob + Transaction POJO + KafkaSource
  • HighValueDetector ProcessFunction + ProcessFunctionTestHarness test
02~6h
Stateful detection over event time

Watermarks, tumbling and sliding windows, KeyedProcessFunction with timers, ValueState Z-score profiles, late-data side outputs — 4 stateful detectors wired and explained.

  • VelocityDetector + SpendingVelocityDetector (tumbling + sliding)
  • CustomerProfileFunction (ValueState Z-score)
  • RapidEscalationDetector + LateDataHandler (timers + sideOutputLateData)
03~7h
Production envelope: exactly-once + K8s

Transactional Kafka sink with EOS, Elasticsearch bulk sink, Flink K8s Operator deployment with ZK HA and RocksDB → S3. The job that runs in module 01 — but deployed the way fintech runs streaming.

  • Kafka EOS sink + Elasticsearch bulk sink + schema-evolution deserializer
  • FlinkDeployment CRD + ZK 3-node HA + incremental RocksDB → S3
  • Prometheus reporter + savepoint-driven upgrade workflow
Project setup · 10 minutes

One command. Local Flink + Kafka + Postgres stack.

You get a real stack on day one — Kafka with Zookeeper, Flink JM/TM, Postgres seed, and 5,000 synthetic transactions ready to replay (with seeded fraud patterns).

What lives in the repo

Everything you need to stand up a production-shaped streaming stack on your laptop, plus the seed data and starter implementations referenced in modules 01–04.

  • docker-compose.yml — Zookeeper, Kafka, Flink JM/TM, Postgres
  • src/main/java/.../jobs/ — TransactionConsumerJob + FraudDetectionJob skeletons
  • src/main/java/.../detectors/ — HighValue, Velocity, SpendingVelocity, CustomerProfile, RapidEscalation
  • src/main/java/.../sinks/ — AlertKafkaSink + ElasticsearchAlertSink
  • src/test/java/ — ProcessFunctionTestHarness scaffolding
  • k8s/ — FlinkDeployment CRD + ZK StatefulSet reference manifests
  • fixtures/ — transactions.jsonl (5,000 events), customers.csv (500 rows), rules.json (20 broadcast rules), expected_alerts.jsonl (298 alerts)
Download · Starter Kit

Flink Fraud Detection Starter Kit

Maven Java 17 project (Flink 1.20), Docker Compose stack, replay producer, pre-built JSONL fixtures, and reference K8s manifests. Skip the boilerplate, start on module 01.

~290 KB · 63 files · seeded fraud fixtures · PRO required
~/projects/flink-fraud-detection — zsh
1. Clone and start the stack
$ git clone github.com/ai-de/p01-flink-fraud-detection
$ cd p01-flink-fraud-detection && docker compose up -d
2. Create Kafka topics + replay fixtures
$ make topics # transactions, fraud-alerts, dlq
$ make replay # streams 5,000 fixtures into transactions
3. Build and submit the Flink job
$ mvn package -DskipTests
$ flink run -d target/fraud-detection-1.0.jar
4. Verify alerts on the Kafka sink topic
$ docker exec kafka kafka-console-consumer.sh \
$ --bootstrap-server localhost:9092 \
$ --topic fraud-alerts --from-beginning | head
5,000
Replay transactions
500
Customer profiles
20
Broadcast rules
298
Expected alerts (regression baseline)
Production hardening

The same pipeline — but built for the 10x case.

Most Flink tutorials show you the DataStream. This one shows what changes when checkpoints fail mid-window, the alert topic is consumed by three teams, and the rules need to update without a redeploy.

Tutorial-grade versionWhat you have today
×
Checkpoint storage
file:///tmp/flink-checkpoints
×
Alert sink
alerts.print() to stdout
×
Delivery guarantee
Default at-least-once Kafka producer
×
State backend
Heap state — fine for 10MB, breaks at 10GB
×
Bad records
Deserializer returns null and drops them silently
×
Job upgrade
Stop the job, lose state, hope for the best
×
Rule updates
Hardcoded thresholds — redeploy to change a number
Your production versionModule 03–04
Checkpoint storage
s3://... with state.backend.incremental: true
Alert sink
KafkaSink with EXACTLY_ONCE + transactional ID prefix
Delivery guarantee
2PC barrier — pre-commit on checkpoint, commit on confirm; read_committed consumers
State backend
EmbeddedRocksDBStateBackend with managed memory + tuned write buffer
Bad records
Side output to dlq Kafka topic + alert metric
Job upgrade
kubectl annotate savepoint trigger — atomic, replayable
Rule updates
Broadcast state from rules Kafka topic — hot reload without redeploy
PRO benefit · code review

Real review from senior engineers who shipped this stack.

Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.

CR

4 reviews / month

Submit a repo, a PR, or a refactor proposal. Reviewer is matched to your domain — Flink/Kafka for this project. Async, comments inline, average turnaround 31 hours.

31h
avg turnaround
9.2/10
helpfulness
94%
return next month
OH

2 office hours / month

Live 30-min sessions with a senior streaming engineer. Architecture questions, whiteboard a tricky watermark, mock a system-design interview. Group sessions also available.

30 min
per session
2 / mo
included
+ group
unlimited
What PRO unlocks

One subscription. 15+ projects, all curriculum, code review.

PRO is built for senior+ engineers who want production-grade builds and feedback loops — not more tutorials.

What you getFREEPROEXPERT
Projects
Production-grade builds
2
15+
8
Curriculum modules
All 7 tracks
Phase 1 only
All
All + bonus
Code review credits
Senior engineer review
0
4 / month
Unlimited
Career path access
5 paths × full plans
1 path
All 5
All 5 + 1:1
Certificate
Verifiable on LinkedIn
Yes
Yes + portfolio review
Community
Discord + office hours
Read-only
Full + 2/mo
Full + 4/mo
$29/mo
billed monthly · cancel anytime
or annual
$249/yr save 28%
Upgrade to PRO
Who this is for

Pick this if you’re shipping at scale, not learning to.

DE

Data engineers moving from batch to streaming

You've shipped dbt + Airflow pipelines. You want to know what changes when the data is unbounded — watermarks, state, exactly-once, recovery.

BE

Backend engineers interviewing at fintech

Stripe, Block, Adyen, and Plaid expect you to whiteboard streaming fraud. After this you can talk through 2PC, keyed state, and operator deployment without hand-waving.

PE

Platform engineers scoping streaming infra

You're sizing a real-time alerting workload for the org. You need to understand the failure modes, the K8s Operator pattern, and what to put behind a service before signing off.

ST

Staff / tech leads

You're driving the move off nightly batch risk scoring. You need to understand watermarks, late data, exactly-once trade-offs, and the operational footprint before the rest of the team commits.

FAQ

Quick answers.

We don't ship a Grafana dashboard JSON, GitHub Actions CI/CD workflow, async-I/O ML scorer, or a Debezium PostgreSQL CDC source. The starter scaffolds extension hooks for those (broadcast-state pattern, side outputs, Prometheus reporter) so you can wire them in your own deployment — but the project's commitment is the streaming + state + exactly-once + K8s spine.
P02 is a system-design portfolio (no code, 69 artifacts, the Uber event platform redesign). This is hands-on Java code on Flink's DataStream API. P02 teaches you to defend a design; this teaches you to ship one.
No for parts 1–3 (local Docker Compose: Kafka + Flink + Postgres). Part 4 K8s deployment uses reference manifests — you can run it on Kind locally, or skip the deploy step and read the FlinkDeployment CRD as a system-design artifact.
Parts 1–3 yes, ~6GB RAM is comfortable (Kafka + 2 Flink containers + Postgres). Part 4 K8s requires Kind or a real cluster — the manifests assume S3 + ZK persistent storage.
Both. The code ships 5 stateful detector classes (HighValue, Velocity, SpendingVelocity, CustomerProfile Z-score, RapidEscalation) — those are the patterns you learn. The starter zip ships 20 broadcast-state rule definitions in rules.json — the catalog you'd hot-reload from a Kafka topic in production.
All 15+ PRO projects, 4 code-review credits per month, 2 office-hours sessions, full curriculum across all 7 tracks, all 5 career paths, certificate of completion, and full community access. Cancel anytime.
Yes. Real-time stream processing is a top topic for senior data and backend roles, especially at fintech. After this you can whiteboard watermarks, exactly-once via 2PC, keyed state vs broadcast state, and the trade-offs between Application Mode and Session Mode on K8s.

Ready to ship a real streaming pipeline?

Start with module 01 — free, no card. About 3-4 hours. By the end you'll have Flink + Kafka running locally and a unit-tested HighValue detector emitting through a side output channel.

P01 · Flink Fraud Detection · PRO · module 01 freeUpgrade to PRO →
Press Cmd+K to open