Build a real-time fraud detection topology on Kafka Streams
Ship a stateful Kafka Streams pipeline with KRaft + Avro + Schema Registry, 5-tier risk-score branching, KStream-KTable enrichment joins, Spring Boot Interactive Queries REST, and Strimzi K8s manifests with Exactly-Once Semantics v2 — no Flink cluster required.
This is the system-design question every streaming team gets asked at Pinterest, Walmart, NYT and any Kafka-centric shop: how do you build a stateful, exactly-once fraud detection pipeline without standing up a separate Flink cluster?
- Single-broker KRaft Kafka + Confluent Schema Registry + Avro Transaction schema running in Docker
- 5-tier risk-score branch topology (RuleFilters + VelocityChecker 5-min tumbling + BranchProcessor) emitting 12+ output topics
- DLQ pattern with custom Processor for validation errors + alert severity routing (CRITICAL / HIGH / MEDIUM)
- KStream-KTable enrichment joins (customer profile + merchant reputation) with materialized state stores
- Spring Boot Interactive Queries REST API querying RocksDB-backed state stores
- Strimzi 3-broker kafka-cluster.yaml + EOSConfiguration.java (EOS v2) + Grafana dashboard JSON + AlertManager rules
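As a rough illustration of the 5-tier branching idea, the sketch below maps a risk score to one of five tiers and decides whether that tier raises an alert. The tier names LOW and CLEAN, the score range, and every threshold are assumptions made for this example; the course's BranchProcessor may define them differently.

```java
// Hypothetical sketch: tier names and cut-offs are illustrative assumptions,
// not the course's actual BranchProcessor logic.
final class RiskTiers {
    enum Tier { CRITICAL, HIGH, MEDIUM, LOW, CLEAN }

    // Maps a risk score in [0.0, 1.0] to one of five tiers (assumed cut-offs).
    static Tier classify(double score) {
        if (Double.isNaN(score) || score < 0.0 || score > 1.0) {
            throw new IllegalArgumentException("score out of range: " + score);
        }
        if (score >= 0.9) return Tier.CRITICAL;
        if (score >= 0.7) return Tier.HIGH;
        if (score >= 0.5) return Tier.MEDIUM;
        if (score >= 0.2) return Tier.LOW;
        return Tier.CLEAN;
    }

    // Only the top three tiers raise alerts, mirroring the
    // CRITICAL / HIGH / MEDIUM severity routing described above.
    static boolean raisesAlert(Tier tier) {
        return tier == Tier.CRITICAL || tier == Tier.HIGH || tier == Tier.MEDIUM;
    }
}
```

In a Kafka Streams topology, predicates like these would typically feed `KStream#split()` with one `Branched` per tier, so each tier lands on its own output topic.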
Kafka Streams is the library answer to streaming.
Most teams reach for Flink and inherit a separate cluster, separate scheduler, separate ops surface. Kafka Streams runs in-process — same JAR, same K8s deployment — and that's the choice Pinterest, NYT, and Walmart made.
In-process > separate cluster
Kafka Streams is a library, not a runtime. You package one Java app, deploy it on K8s, and the topology is the program. No JobManager, no TaskManagers, no separate infra to operate.
Stateful processing without Flink
RocksDB state stores + changelog topics give you durable, queryable state. KTable joins, session windows, and tumbling aggregations all work without a checkpoint coordinator.
Exactly-once that ships
EOS v2 = transactional producers + read_committed consumers + state-store changelog writes committed in the same transaction as the output records and offsets. PROCESSING_GUARANTEE_CONFIG=EXACTLY_ONCE_V2 is one config line; the rest is correctness.
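A minimal sketch of that one config line, written with plain string keys so it compiles without the Kafka Streams JAR on the classpath. The key `processing.guarantee` and value `exactly_once_v2` are what `StreamsConfig.PROCESSING_GUARANTEE_CONFIG` and `StreamsConfig.EXACTLY_ONCE_V2` resolve to; the class name and the application id and bootstrap address in the usage note are placeholders, not the contents of the course's EOSConfiguration.java.

```java
import java.util.Properties;

// Hypothetical helper: builds the minimal Streams properties for EOS v2.
final class EosProps {
    static Properties build(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        props.put("application.id", applicationId);       // StreamsConfig.APPLICATION_ID_CONFIG
        props.put("bootstrap.servers", bootstrapServers); // StreamsConfig.BOOTSTRAP_SERVERS_CONFIG
        // The one line that switches the app to Exactly-Once Semantics v2.
        props.put("processing.guarantee", "exactly_once_v2");
        return props;
    }
}
```

For example, `EosProps.build("fraud-detector", "localhost:9092")` could be passed to the `KafkaStreams` constructor alongside the topology. Broker-side settings such as transaction-state replication still have to be sized for the cluster.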
Strimzi turns the messy parts into YAML
3 brokers, transaction-state replication, TLS listeners, HPA on the streams app — all declarative. The K8s operator pattern is how Kafka actually runs in production now.
Module 01 is free. The rest unlocks with PRO.
Try the first 2-3 hours — stand up KRaft Kafka, register an Avro schema, ship a producer + consumer in real Java. If it clicks, upgrade to unlock the fraud topology, KTable joins, and the Strimzi packaging.
Kafka Streams Learning Path
This curriculum is the foundation for the project — not a sales add-on. PRO subscribers get full access to every module.
Three sprints. Three checkpoints. One fraud detection topology.
Each phase ends with a tagged commit and a working artifact. No ambiguity about where you are in the build.
KRaft Kafka + Schema Registry + Avro producer running locally. 5-tier risk-score branch topology with VelocityChecker (5-min tumbling) + DLQ + alert severity routing emitting to 12+ topics.
- ✓ docker-compose.yml + Transaction.avsc + TransactionProducer
- ✓ RuleFilters + VelocityChecker + BranchProcessor + DLQHandler
- ✓ AlertStream merging 3 alert streams by severity
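To make the 5-minute tumbling window concrete, here is a plain-Java sketch of the bucketing a velocity check relies on: each event timestamp maps to exactly one epoch-aligned, non-overlapping window, and counts are kept per card per window. The class name, the `cardId` key, and the in-memory map are assumptions for illustration; in the actual topology the state would live in a windowed state store.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-card counting in 5-minute tumbling windows.
final class VelocitySketch {
    static final long WINDOW_MS = Duration.ofMinutes(5).toMillis();

    // Stand-in for a windowed state store: key is "cardId@windowStart".
    private final Map<String, Integer> counts = new HashMap<>();

    // Start of the epoch-aligned window that contains the timestamp.
    static long windowStart(long timestampMs) {
        return timestampMs - Math.floorMod(timestampMs, WINDOW_MS);
    }

    // Records one transaction and returns the card's count in its window.
    int record(String cardId, long timestampMs) {
        String key = cardId + "@" + windowStart(timestampMs);
        return counts.merge(key, 1, Integer::sum);
    }
}
```

In the Kafka Streams DSL the equivalent shape is `groupByKey()` followed by `windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))` and `count()`; a velocity rule then compares the count against a threshold.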
KTable joins for customer + merchant context. Spring Boot Interactive Queries REST API exposing materialized state stores. Kafka Connect JSON configs for JDBC + ES sinks.
- ✓ CustomerProfileTable + MerchantReputationTable
- ✓ EnrichmentJoins with leftJoin + combined fraud score
- ✓ Spring REST QueryController + Connect connector JSONs
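The left-join semantics matter for the combined fraud score: with a KStream-KTable `leftJoin`, a transaction with no matching table row still flows through, and the joiner receives `null` for the table side. The sketch below shows one way a joiner could handle that. The weights, the neutral default, and the method name are assumptions for illustration, not the course's EnrichmentJoins implementation.

```java
// Hypothetical sketch of a null-tolerant score combiner for leftJoin output.
final class EnrichmentSketch {
    // Assumed fallback when a customer or merchant has no table entry yet.
    static final double NEUTRAL = 0.5;

    // Combines the rule-based score with customer and merchant risk.
    // A null risk value represents a leftJoin miss on the KTable side.
    static double combinedScore(double ruleScore, Double customerRisk, Double merchantRisk) {
        double customer = customerRisk != null ? customerRisk : NEUTRAL;
        double merchant = merchantRisk != null ? merchantRisk : NEUTRAL;
        // Assumed weights: 50% rules, 30% customer profile, 20% merchant reputation.
        return 0.5 * ruleScore + 0.3 * customer + 0.2 * merchant;
    }
}
```

An inner `join` would drop unmatched transactions instead, which for fraud detection would let transactions from unknown customers or merchants skip scoring entirely.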
Strimzi 3-broker kafka-cluster.yaml + EOSConfiguration.java with EOS v2 + Grafana dashboard JSON + AlertManager routing rules + deploy.yml CI/CD.
- ✓ Strimzi Kafka custom resource + HPA
- ✓ EOSConfiguration.java (EXACTLY_ONCE_V2)
- ✓ Grafana JSON + AlertManager rules + deploy.yml
One command. Local KRaft Kafka + Schema Registry + sample fixtures.
You get the full streaming stack on day one — KRaft Kafka (no ZooKeeper), Confluent Schema Registry, Kafka UI for inspection, and 1k pre-built sample transactions across the schema you'll author in module 01.
What lives in the repo
Everything you need to run the Kafka Streams fraud topology on your laptop, plus the Avro schemas and verification fixtures used in modules 02–04.
- docker-compose.yml — KRaft Kafka, Schema Registry, Kafka UI
- src/main/avro/ — Transaction.avsc + 4 Avro schemas
- src/main/java/com/streamguard/ — producers, consumers, fraud topology
- src/main/java/com/streamguard/fraud/ — RuleFilters, VelocityChecker, BranchProcessor, DLQHandler
- k8s/ — Strimzi kafka-cluster.yaml + HPA + ConfigMap manifests
- monitoring/ — Grafana dashboard JSON + AlertManager routing rules
StreamCart Analytics Starter Kit
Kafka Streams DSL scaffolds, 4 Avro schemas, Schema Registry registration helper, 1k sample transactions, K8s + Strimzi manifests, Grafana dashboard JSON, and a smoke-test script. `bash scripts/smoke_test.sh` runs after `docker compose up -d`.
The same topology — but built for the 10x case.
Most Kafka Streams tutorials hand you a KStream.foreach(). This one shows what changes when EOS v2 has to survive a broker reboot, the schema registry rejects a renamed field, and the alert routing actually pages a human.
- min.insync.replicas=2
- kubectl rollout status gates
- num.standby.replicas=1 + remote checkpoints on EBS
- FULL_TRANSITIVE compatibility
- chaos-mesh experiments: broker kill, network partition, disk full
Real review from senior engineers who shipped this stack.
Submit your repo, get line-by-line feedback within 48 hours. The kind of review that's quietly worth thousands of dollars in time-to-staff.
4 reviews / month
Submit a repo, a PR, or a refactor proposal. Reviewer is matched to your domain — Kafka Streams / Strimzi / EOS for this project. Async, comments inline, average turnaround 31 hours.
2 office hours / month
Live 30-min sessions with a senior data engineer. Architecture questions, whiteboard a tricky migration, mock a system-design interview. Group sessions also available.
One subscription. 15+ projects, all curriculum, code review.
PRO is built for engineers who want production-shaped builds and feedback loops — not more tutorials.
Pick this if you’re shipping streaming in Java, not just learning Kafka concepts.
Streaming engineers
You’ve consumed from a Kafka topic, but the stateful side — KTable joins, sessions, EOS — is still abstract. This makes it concrete in real Java.
Data engineers
You run a batch warehouse but the org wants real-time fraud scoring. You need a streaming pattern you can defend without buying into the Flink operator stack.
Platform engineers
You operate Kafka for 5+ teams. You want a Strimzi-on-K8s reference your downstream teams can copy without burning your ops budget on cluster sprawl.
Backend engineers crossing over
You write Java services. The data side is opaque. This bridges from Spring Boot REST → Kafka Streams topology in language you already speak.
Going deeper? Three tracks back this project.
Kafka Streams is the spine. These three curricula let you go deeper on the substrate the topology sits on: event design, system design, and Kafka Streams itself.
Ready to ship a real streaming topology?
Start with module 01 — free, no card. About 2-3 hours. By the end you'll have KRaft Kafka + Schema Registry + your first Avro-serialized Kafka Streams topology running locally with Java.