ai-de.net/Projects/P01 · Flink Fraud Detection

PRO · module 01 free preview · Streaming track · P01

Build a production-grade fraud detection pipeline on Apache Flink

Run Flink + Kafka locally with Docker. Ship 5 fraud detectors (4 of them stateful) over event-time windows, exactly-once writes into Kafka via two-phase commit, and a Flink Kubernetes Operator deployment with ZooKeeper HA and RocksDB incremental checkpoints. No managed streaming service required.

Timeline: 16-20 hours

Difficulty: Senior+

Stack: Flink · Kafka · RocksDB · Kubernetes

See PRO benefits

This is the system-design question asked at Stripe, Block, Adyen, and Plaid — and any team running real-time decisions on streaming events.

By the end you will have

A working Flink + Kafka stack running locally (Docker Compose with ZooKeeper, Kafka, Flink JobManager and TaskManager, Postgres)
5 fraud detectors: HighValue (stateless), plus four stateful ones: Velocity (tumbling), SpendingVelocity (sliding), CustomerProfile (Z-score), RapidEscalation (timer-driven)
Exactly-once Kafka sink via DeliveryGuarantee.EXACTLY_ONCE + transactional ID prefix
Elasticsearch bulk sink with daily indices and idempotent doc IDs
ProcessFunctionTestHarness unit-test scaffolding
Flink Kubernetes Operator FlinkDeployment CRD with ZK 3-node HA + RocksDB incremental → S3

PREREQ · Built for senior+ engineers. You should be comfortable with Java basics (Java 17 + Maven), Docker Compose, and Kafka concepts (topics, partitions, offsets). This is not a streaming primer: it assumes you’ve operated a producer/consumer in production before.

[Architecture diagram: fraud-detection · transactions → fraud-alerts · exactly-once]

Docker stack (docker compose up): kafka, zookeeper, flink-jm, flink-tm, postgres
Kafka topics (input backbone, read_committed): transactions (5,000 events, seeded fraud); customers (500 profiles, Postgres seed); rules (20 rules, broadcast state)
Flink operators (keyBy(customerId)): HighValueDetector (ProcessFunction, M01); VelocityDetector (tumbling 1 min); SpendingVelocityDetector (sliding 1h / 5m); CustomerProfileFunction (ValueState, Z-score); RapidEscalationDetector (timers, cleanup)
Sinks (idempotent IDs, downstream tee): fraud-alerts (Kafka, EOS); fraud-alerts (ES, daily idx); dlq (side output)

# Exactly-once via 2PC
KafkaSink.builder()
.setDeliveryGuarantee(EXACTLY_ONCE)
barrier · precommit · commit on confirm
→ no duplicate fraud alerts on rebalance

● RocksDB → S3 incremental
state.backend: rocksdb
state.backend.incremental: true
savepoint-driven upgrades on K8s
→ Flink Operator + ZK HA in production

Stateful detectors · EOS via 2PC into Kafka · K8s, Operator-deployed

Why this matters in 2026

Streaming is the default for fraud, risk, and routing.

Batch nightly scoring is no longer acceptable for transactional risk. The patterns you ship in this project — keyed state, event-time windows, two-phase commit — are the ones in production at the companies setting the bar for real-time decisioning.

Exactly-once is table-stakes

Duplicate alerts cost real money — chargebacks, false declines, eroded trust. Flink’s 2PC + transactional Kafka sinks are the reference implementation; this project walks the barrier protocol end-to-end.
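To make the barrier protocol concrete, here is a minimal plain-Java sketch of the two phases. It is a toy model, not Flink's KafkaSink implementation: the class name `TwoPhaseCommitSketch` and its methods are hypothetical, and recovery is simplified to "abort everything that was never confirmed, then let the source replay".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a two-phase-commit sink. Hypothetical names; not Flink API. */
final class TwoPhaseCommitSketch {
    // Records visible to read_committed consumers.
    private final List<String> committed = new ArrayList<>();
    // Pre-committed transactions, keyed by the checkpoint that closed them.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    // The currently open transaction.
    private List<String> open = new ArrayList<>();

    void write(String alert) {
        open.add(alert);
    }

    /** Phase 1: the checkpoint barrier arrives, so the open transaction is pre-committed. */
    void preCommit(long checkpointId) {
        pending.put(checkpointId, open);
        open = new ArrayList<>();
    }

    /** Phase 2: the checkpoint is confirmed, so every transaction up to it is committed. */
    void commit(long checkpointId) {
        var done = pending.headMap(checkpointId, true);
        done.values().forEach(committed::addAll);
        done.clear();
    }

    /** Simplified failure path: unconfirmed work is aborted and the source replays it. */
    void abortAndRestore() {
        pending.clear();
        open = new ArrayList<>();
    }

    List<String> committed() {
        return List.copyOf(committed);
    }

    public static void main(String[] args) {
        TwoPhaseCommitSketch sink = new TwoPhaseCommitSketch();
        sink.write("alert-1");
        sink.preCommit(1);
        sink.commit(1);
        sink.write("alert-2");
        sink.abortAndRestore();   // crash before checkpoint 2 completes
        sink.write("alert-2");    // replayed from the source after restore
        sink.preCommit(2);
        sink.commit(2);
        System.out.println(sink.committed()); // [alert-1, alert-2]
    }
}
```

The replayed alert appears once because the first attempt was never committed, which is the property read_committed consumers rely on.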

Keyed state replaces nightly batch

Velocity, escalation, and per-customer anomaly used to be Spark jobs at midnight. Keyed ValueState + timers move them inline — sub-window decisioning, not next-day reports.
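As a rough illustration of what "inline" per-customer anomaly scoring means, the sketch below keeps a running mean and variance per customer (Welford's online algorithm) and scores each new amount as a Z-score. It is plain Java: the `HashMap` stands in for keyed ValueState, and the class and method names are hypothetical rather than taken from the project code.

```java
import java.util.HashMap;
import java.util.Map;

/** Per-customer Z-score profile. Plain-Java sketch; the map stands in for keyed state. */
final class ZScoreProfileSketch {
    /** Running count, mean, and sum of squared deviations for one customer. */
    private static final class Stats {
        long n;
        double mean;
        double m2;
    }

    private final Map<String, Stats> state = new HashMap<>();

    /** Scores the amount against the customer's history, then folds it into the profile. */
    double scoreAndUpdate(String customerId, double amount) {
        Stats s = state.computeIfAbsent(customerId, k -> new Stats());
        double z = 0.0;
        if (s.n >= 2) {
            double std = Math.sqrt(s.m2 / (s.n - 1)); // sample standard deviation
            if (std > 0) {
                z = (amount - s.mean) / std;
            }
        }
        // Welford's online update: no need to store past amounts.
        s.n++;
        double delta = amount - s.mean;
        s.mean += delta / s.n;
        s.m2 += delta * (amount - s.mean);
        return z;
    }

    public static void main(String[] args) {
        ZScoreProfileSketch profile = new ZScoreProfileSketch();
        profile.scoreAndUpdate("c1", 10.0);
        profile.scoreAndUpdate("c1", 20.0);
        profile.scoreAndUpdate("c1", 30.0);
        System.out.println(profile.scoreAndUpdate("c1", 50.0)); // 3.0
    }
}
```

The state per customer is three numbers, which is why this kind of profile fits comfortably in keyed state instead of a nightly batch job.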

K8s Operator pattern

Declarative streaming infra. The Flink Kubernetes Operator turns a streaming job into a CRD: the job spec lives in git, upgrades go through savepoints, and failover is declarative.

Interview signal

Senior+ data and backend roles at fintech expect you to whiteboard watermarks, exactly-once delivery, and stateful processing. The right vocabulary, with code to back it.

Curriculum · 4 modules · 16-20 hours

Module 01 is free. The rest unlocks with PRO.

Try the first 3-4 hours — stand up the local Flink + Kafka stack, write your first DataStream pipeline, ship a unit-tested ProcessFunction with side outputs. If it clicks, upgrade to unlock stateful detection, exactly-once connectors, and the K8s deployment modules.


Module 01 is free — no card required. Get a feel for the stack before paying.

M01

✓ Foundation: Flink + Kafka + DataStream basics

Stand up Flink + Kafka with Docker Compose. Build your first DataStream pipeline with map / filter / keyBy. Implement a stateless HighValue detector as a ProcessFunction with side outputs. Cover it with a ProcessFunctionTestHarness unit test.
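The shape of a stateless detector with a side output can be sketched without Flink at all. In the plain-Java version below, two lists stand in for the main output and the side output. The threshold value, the rule for what counts as a rejected record, and all names are assumptions made for illustration, not the module's actual code.

```java
import java.util.ArrayList;
import java.util.List;

/** Stateless high-value check with a side output. Plain-Java sketch; names are hypothetical. */
final class HighValueSketch {
    record Transaction(String id, String customerId, double amount) {}

    // Assumed threshold; the real value would come from configuration or rules.
    static final double THRESHOLD = 10_000.0;

    final List<Transaction> alerts = new ArrayList<>();   // stands in for the main output
    final List<Transaction> rejected = new ArrayList<>(); // stands in for the side output

    /** Routes one element: malformed records go to the side output, large ones raise an alert. */
    void processElement(Transaction t) {
        if (t.customerId() == null || t.amount() < 0) {
            rejected.add(t);
        } else if (t.amount() > THRESHOLD) {
            alerts.add(t);
        }
        // Everything else passes without producing output.
    }

    public static void main(String[] args) {
        HighValueSketch detector = new HighValueSketch();
        detector.processElement(new Transaction("t1", "c1", 25_000.0));
        detector.processElement(new Transaction("t2", "c1", 40.0));
        detector.processElement(new Transaction("t3", null, 99.0));
        System.out.println(detector.alerts.size() + " alert, " + detector.rejected.size() + " rejected");
    }
}
```

Because the function holds no state between elements, a unit test only needs to feed it records and inspect the two outputs.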

3-4h · 9 lessons · FREE PREVIEW

Start →

M02

⊘ Time, windows, state — 4 stateful detectors

Master event-time processing with bounded-lateness watermarks. Build Velocity (tumbling 1-min), SpendingVelocity (sliding 1h / 5-min slide), CustomerProfile (ValueState Z-score anomaly), and RapidEscalation (KeyedProcessFunction with timer-driven cleanup). Tune the RocksDB state backend.
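The difference between the two window types comes down to arithmetic on the event timestamp. The sketch below, in plain Java with hypothetical names and a zero window offset assumed, computes which tumbling window an event falls into and which sliding windows contain it. With a 1-hour size and a 5-minute slide, every event belongs to 12 overlapping windows.

```java
import java.util.ArrayList;
import java.util.List;

/** Window-assignment arithmetic. Plain-Java sketch; zero offset assumed. */
final class WindowAssignSketch {
    static final long MINUTE = 60_000L;

    /** Start of the tumbling window of the given size that contains ts (epoch millis). */
    static long tumblingStart(long ts, long size) {
        return ts - Math.floorMod(ts, size);
    }

    /** Starts of every sliding window [start, start + size) that contains ts, newest first. */
    static List<Long> slidingStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        for (long start = tumblingStart(ts, slide); start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long ts = 127 * MINUTE; // an event 2h07m after the epoch
        System.out.println(tumblingStart(ts, MINUTE));
        System.out.println(slidingStarts(ts, 60 * MINUTE, 5 * MINUTE).size()); // 12
    }
}
```

This is why sliding windows cost more state than tumbling ones: each element contributes to size / slide windows instead of one.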

5-6h · 12 lessons · PRO TIER

Unlock with PRO →

M03

⊘ Connectors: exactly-once Kafka + Elasticsearch

Wire a tuned Kafka source (fetch sizing, session timeout). Ship the alert sink with DeliveryGuarantee.EXACTLY_ONCE, transactional ID prefix, and read_committed consumers. Add an Elasticsearch bulk sink with daily indices and idempotent doc IDs. Walk the two-phase commit barrier protocol.
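Daily indices and idempotent doc IDs are both deterministic functions of the alert itself, which is what makes a replayed write overwrite the same document instead of creating a second one. The sketch below uses only the Java standard library. The index name pattern, the choice of fields hashed into the ID, and the class name are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HexFormat;

/** Deterministic index names and document IDs. Sketch; naming scheme is assumed. */
final class AlertIndexingSketch {
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy.MM.dd").withZone(ZoneOffset.UTC);

    /** Index is derived from event time, so a replay lands in the same index. */
    static String dailyIndex(long eventTimeMillis) {
        return "fraud-alerts-" + DAY.format(Instant.ofEpochMilli(eventTimeMillis));
    }

    /** Same transaction and detector always hash to the same document ID. */
    static String docId(String transactionId, String detector) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            byte[] hash = sha.digest((transactionId + "|" + detector).getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(dailyIndex(0L)); // fraud-alerts-1970.01.01
        System.out.println(docId("txn-42", "VelocityDetector"));
    }
}
```

Deriving the index from event time rather than wall-clock time matters here: a record replayed after midnight would otherwise be written to a different index and escape deduplication.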

4-5h · 11 lessons · PRO TIER

Unlock with PRO →

M04

⊘ Kubernetes: Flink Operator, ZK HA, RocksDB → S3

Deploy via the Flink Kubernetes Operator (Application Mode + FlinkDeployment CRD). Wire ZooKeeper 3-node HA and incremental RocksDB checkpoints to S3. Configure the Prometheus reporter on port 9249. Drive savepoint-based upgrades via kubectl annotation.

4-5h · 11 lessons · PRO TIER

Unlock with PRO →

3 modules locked · Unlock all PRO content for $29/mo

Upgrade to PRO →

Backed by curriculum

Apache Flink & Stream Processing

9 modules · ~14 hours · DataStream API · watermarks · keyed state · exactly-once · K8s Operator

Open curriculum →

This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.

The build, in 3 phases

Three sprints. Three checkpoints. One production fraud pipeline.

Each phase ends with a tagged commit and a working artifact. No ambiguity about where you are.

01 · ~4h

Stand up the streaming foundation

Flink + Kafka running locally via Docker. First DataStream pipeline reading transactions and emitting to stdout. One stateless detector covered by a unit test.

✓ Docker Compose stack (ZooKeeper, Kafka, Flink JM/TM, Postgres)
✓ TransactionConsumerJob + Transaction POJO + KafkaSource
✓ HighValueDetector ProcessFunction + ProcessFunctionTestHarness test

02 · ~6h

Stateful detection over event time

Watermarks, tumbling and sliding windows, KeyedProcessFunction with timers, ValueState Z-score profiles, late-data side outputs — 4 stateful detectors wired and explained.

✓ VelocityDetector + SpendingVelocityDetector (tumbling + sliding)
✓ CustomerProfileFunction (ValueState Z-score)
✓ RapidEscalationDetector + LateDataHandler (timers + sideOutputLateData)
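Watermarks and late data in this phase follow a simple rule that can be sketched in a few lines of plain Java. The model below tracks the highest timestamp seen, derives the watermark by subtracting a fixed out-of-orderness bound, and calls an event late when its window has already been closed by that watermark. The class name and the zero allowed-lateness setting are assumptions; this is a model of the idea, not Flink's classes.

```java
/** Bounded out-of-orderness watermark model. Plain-Java sketch; not Flink API. */
final class WatermarkSketch {
    private final long maxOutOfOrdernessMs;
    private long maxTimestamp;

    WatermarkSketch(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        // Start low enough that the first watermark does not underflow.
        this.maxTimestamp = Long.MIN_VALUE + maxOutOfOrdernessMs + 1;
    }

    /** The watermark only moves forward: out-of-order events never pull it back. */
    void onEvent(long eventTimestamp) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }

    /** "No events with a timestamp at or below this value are expected any more." */
    long currentWatermark() {
        return maxTimestamp - maxOutOfOrdernessMs - 1;
    }

    /** True when the window ending at windowEnd (exclusive) was already closed; zero allowed lateness assumed. */
    boolean isLate(long windowEnd) {
        return windowEnd - 1 <= currentWatermark();
    }

    public static void main(String[] args) {
        WatermarkSketch wm = new WatermarkSketch(5_000L);
        wm.onEvent(70_000L);
        System.out.println(wm.currentWatermark()); // 64999
        System.out.println(wm.isLate(60_000L));    // true: the first 1-min window is closed
        System.out.println(wm.isLate(120_000L));   // false: the second window is still open
    }
}
```

Events that fail this check are the ones a late-data side output would capture instead of silently dropping.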

03 · ~7h

Production envelope: exactly-once + K8s

Transactional Kafka sink with EOS, Elasticsearch bulk sink, Flink K8s Operator deployment with ZK HA and RocksDB → S3. The job that runs in module 01 — but deployed the way fintech runs streaming.

✓ Kafka EOS sink + Elasticsearch bulk sink + schema-evolution deserializer
✓ FlinkDeployment CRD + ZK 3-node HA + incremental RocksDB → S3
✓ Prometheus reporter + savepoint-driven upgrade workflow

Project setup · 10 minutes

One command. Local Flink + Kafka + Postgres stack.

You get a real stack on day one: Kafka with ZooKeeper, Flink JM/TM, Postgres seed, and 5,000 synthetic transactions ready to replay (with seeded fraud patterns).

What lives in the repo

Everything you need to stand up a production-shaped streaming stack on your laptop, plus the seed data and starter implementations referenced in modules 01–04.

docker-compose.yml — Zookeeper, Kafka, Flink JM/TM, Postgres
src/main/java/.../jobs/ — TransactionConsumerJob + FraudDetectionJob skeletons
src/main/java/.../detectors/ — HighValue, Velocity, SpendingVelocity, CustomerProfile, RapidEscalation
src/main/java/.../sinks/ — AlertKafkaSink + ElasticsearchAlertSink
src/test/java/ — ProcessFunctionTestHarness scaffolding
k8s/ — FlinkDeployment CRD + ZK StatefulSet reference manifests
fixtures/ — transactions.jsonl (5,000 events), customers.csv (500 rows), rules.json (20 broadcast rules), expected_alerts.jsonl (298 alerts)

Download · Starter Kit

Flink Fraud Detection Starter Kit

Maven Java 17 project (Flink 1.20), Docker Compose stack, replay producer, pre-built JSONL fixtures, and reference K8s manifests. Skip the boilerplate, start on module 01.

~290 KB · 63 files · seeded fraud fixtures · PRO required

~/projects/flink-fraud-detection — zsh

1. Clone and start the stack

$ git clone https://github.com/ai-de/p01-flink-fraud-detection

$ cd p01-flink-fraud-detection && docker compose up -d

2. Create Kafka topics + replay fixtures

$ make topics # transactions, fraud-alerts, dlq

$ make replay # streams 5,000 fixtures into transactions

3. Build and submit the Flink job

$ mvn package -DskipTests

$ flink run -d target/fraud-detection-1.0.jar

4. Verify alerts on the Kafka sink topic

$ docker exec kafka kafka-console-consumer.sh \
    --bootstrap-server localhost:9092 \
    --topic fraud-alerts --from-beginning | head

5,000 replay transactions · 500 customer profiles · 20 broadcast rules · 298 expected alerts (regression baseline)

Production hardening

The same pipeline — but built for the 10x case.

Most Flink tutorials show you the DataStream. This one shows what changes when checkpoints fail mid-window, the alert topic is consumed by three teams, and the rules need to update without a redeploy.

Tutorial-grade version (what you have today)

Checkpoint storage: file:///tmp/flink-checkpoints
Alert sink: alerts.print() to stdout
Delivery guarantee: default at-least-once Kafka producer
State backend: heap state, fine for 10MB, breaks at 10GB
Bad records: deserializer returns null and drops them silently
Job upgrade: stop the job, lose state, hope for the best
Rule updates: hardcoded thresholds, redeploy to change a number

Your production version (Modules 03–04)

✓ Checkpoint storage: s3://... with state.backend.incremental: true
✓ Alert sink: KafkaSink with EXACTLY_ONCE + transactional ID prefix
✓ Delivery guarantee: 2PC barrier, pre-commit on checkpoint, commit on confirm; read_committed consumers
✓ State backend: EmbeddedRocksDBStateBackend with managed memory + tuned write buffer
✓ Bad records: side output to dlq Kafka topic + alert metric
✓ Job upgrade: kubectl annotate savepoint trigger, atomic and replayable
✓ Rule updates: broadcast state from rules Kafka topic, hot reload without redeploy
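The rule hot-reload idea in the last row can be sketched in plain Java: one method handles rule updates, another evaluates transactions against whatever rules are current at that moment. The `HashMap` stands in for broadcast state, and the `Rule` shape and all names are hypothetical, not the schema of rules.json.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Rule hot reload. Plain-Java sketch; the map stands in for broadcast state. */
final class BroadcastRulesSketch {
    // Hypothetical rule shape: flag any amount above maxAmount.
    record Rule(String id, double maxAmount) {}

    private final Map<String, Rule> rules = new HashMap<>();

    /** Broadcast side: a rule update replaces the previous version with the same ID. */
    void onRule(Rule rule) {
        rules.put(rule.id(), rule);
    }

    /** Data side: evaluate against the rules as they are right now. */
    List<String> onTransaction(double amount) {
        List<String> violated = new ArrayList<>();
        for (Rule r : rules.values()) {
            if (amount > r.maxAmount()) {
                violated.add(r.id());
            }
        }
        return violated;
    }

    public static void main(String[] args) {
        BroadcastRulesSketch job = new BroadcastRulesSketch();
        job.onRule(new Rule("high-value", 10_000.0));
        System.out.println(job.onTransaction(7_500.0)); // []
        job.onRule(new Rule("high-value", 5_000.0));    // threshold lowered, no redeploy
        System.out.println(job.onTransaction(7_500.0)); // [high-value]
    }
}
```

The same transaction amount produces a different result after the update, with no restart in between, which is the behavior the hardcoded-threshold version cannot offer.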

PRO benefit · code review

Real review from senior engineers who shipped this stack.

Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.

4 reviews / month

Submit a repo, a PR, or a refactor proposal. Reviewer is matched to your domain — Flink/Kafka for this project. Async, comments inline, average turnaround 31 hours.

31h avg turnaround · 9.2/10 helpfulness · 94% return next month

2 office hours / month

Live 30-min sessions with a senior streaming engineer. Architecture questions, whiteboard a tricky watermark, mock a system-design interview. Group sessions also available.

30 min per session · 2 / mo included · group sessions unlimited

What PRO unlocks

One subscription. 15+ projects, all curriculum, code review.

PRO is built for senior+ engineers who want production-grade builds and feedback loops — not more tutorials.

What you get (FREE · PRO · EXPERT)

Projects (production-grade builds): 15+ (PRO)
Curriculum modules (all 7 tracks): Phase 1 only (FREE) · All (PRO) · All + bonus (EXPERT)
Code review credits (senior engineer review): 4 / month (PRO) · Unlimited (EXPERT)
Career path access (5 paths × full plans): 1 path (FREE) · All 5 (PRO) · All 5 + 1:1 (EXPERT)
Certificate (verifiable on LinkedIn): none (FREE) · Yes (PRO) · Yes + portfolio review (EXPERT)
Community (Discord + office hours): Read-only (FREE) · Full + 2/mo (PRO) · Full + 4/mo (EXPERT)

$29/mo billed monthly · cancel anytime

or $249/yr billed annually (save 28%)

Upgrade to PRO →

Who this is for

Pick this if you’re shipping at scale, not learning to.

Data engineers moving from batch to streaming

You've shipped dbt + Airflow pipelines. You want to know what changes when the data is unbounded — watermarks, state, exactly-once, recovery.

Backend engineers interviewing at fintech

Stripe, Block, Adyen, and Plaid expect you to whiteboard streaming fraud. After this you can talk through 2PC, keyed state, and operator deployment without hand-waving.

Platform engineers scoping streaming infra

You're sizing a real-time alerting workload for the org. You need to understand the failure modes, the K8s Operator pattern, and what to put behind a service before signing off.

Staff / tech leads

You're driving the move off nightly batch risk scoring. You need to understand watermarks, late data, exactly-once trade-offs, and the operational footprint before the rest of the team commits.

Related curriculum

Going deeper? Three tracks back this project.

Flink fundamentals are the spine. These three curriculums let you go deeper on the layers that matter most — Kafka internals, streaming primitives, and event schema design.

FAQ

Quick answers.

What's NOT in scope?

We don't ship a Grafana dashboard JSON, GitHub Actions CI/CD workflow, async-I/O ML scorer, or a Debezium PostgreSQL CDC source. The starter scaffolds extension hooks for those (broadcast-state pattern, side outputs, Prometheus reporter) so you can wire them in your own deployment — but the project's commitment is the streaming + state + exactly-once + K8s spine.

How is this different from the kafka-event-routing project (P02)?

P02 is a system-design portfolio (no code, 69 artifacts, the Uber event platform redesign). This is hands-on Java code on Flink's DataStream API. P02 teaches you to defend a design; this teaches you to ship one.

Do I need cloud credentials?

Not for modules 01–03 (local Docker Compose: Kafka + Flink + Postgres). The module 04 K8s deployment uses reference manifests: you can run it on Kind locally, or skip the deploy step and read the FlinkDeployment CRD as a system-design artifact.

Will this run on my laptop?

Modules 01–03, yes: ~6 GB RAM is comfortable (Kafka + 2 Flink containers + Postgres). The module 04 K8s deployment requires Kind or a real cluster, and the manifests assume S3 + ZK persistent storage.

Why 5 detectors and 20 rules: which number is real?

Both. The code ships 5 detector classes: HighValue (stateless) plus four stateful ones (Velocity, SpendingVelocity, CustomerProfile Z-score, RapidEscalation). Those are the patterns you learn. The starter zip ships 20 broadcast-state rule definitions in rules.json, the catalog you'd hot-reload from a Kafka topic in production.

What does PRO actually unlock for $29/mo?

All 15+ PRO projects, 4 code-review credits per month, 2 office-hours sessions, full curriculum across all 7 tracks, all 5 career paths, certificate of completion, and full community access. Cancel anytime.

Will this help with senior+ data / backend interviews?

Yes. Real-time stream processing is a top topic for senior data and backend roles, especially at fintech. After this you can whiteboard watermarks, exactly-once via 2PC, keyed state vs broadcast state, and the trade-offs between Application Mode and Session Mode on K8s.

Ready to ship a real streaming pipeline?

Start with module 01 — free, no card. About 3-4 hours. By the end you'll have Flink + Kafka running locally and a unit-tested HighValue detector emitting through a side output channel.

See PRO benefits

Related tracks: Kafka Streams Learning Path · Real-Time Streaming Architecture · Event Design & Data Contracts