Kafka Topics Explained: What They Are and How They Work
A Kafka topic is a named, ordered log of events split into partitions. Each event in a partition gets a sequential offset. Consumer groups commit an offset per partition so they can resume from where they left off, enabling fault-tolerant, parallel, replayable event consumption.
Topic → Partition → Offset
Topic: user-events (3 partitions)
Partition 0: [offset 0] [offset 1] [offset 2] [offset 3] ...
Partition 1: [offset 0] [offset 1] [offset 2] ...
Partition 2: [offset 0] [offset 1] [offset 2] [offset 3] [offset 4] ...
Consumer Group "analytics":
Consumer A reads Partition 0 (at offset 3)
Consumer B reads Partition 1 (at offset 2)
Consumer C reads Partition 2 (at offset 4)
Core Concepts
Partition
Unit of Parallelism
An ordered, immutable log on a single broker. Events are appended only. A topic with N partitions supports up to N parallel consumers in one group.
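The key-to-partition mapping can be sketched in a few lines. Note this is a conceptual stand-in: Kafka's default partitioner hashes the key with murmur2, while the sketch below uses CRC32 — what matters is the property that the same key always maps to the same partition.

```python
import zlib

def choose_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition. Kafka's default partitioner
    uses murmur2; crc32 is a stand-in with the same property:
    same key -> same partition, every time."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# Every event keyed by "user-42" lands on the same partition,
# so that user's events stay in order relative to each other.
p1 = choose_partition("user-42", 3)
p2 = choose_partition("user-42", 3)
```

Because the mapping depends on the partition count, adding partitions to an existing topic changes where keys land — plan partition counts up front for keyed topics.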
Offset
Event Position
Sequential integer identifying each event within a partition. Consumer groups commit their current offset to track progress and resume after failures.
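The append/commit/resume cycle can be illustrated with a minimal in-memory model (a sketch of the mechanics, not Kafka's implementation):

```python
class PartitionLog:
    """Append-only log for one partition; offsets are list indexes."""
    def __init__(self):
        self.events = []

    def append(self, event) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the new event

class Consumer:
    """Tracks a committed offset so it can resume after a restart."""
    def __init__(self, log: PartitionLog):
        self.log = log
        self.committed = 0  # next offset to read

    def poll(self):
        batch = self.log.events[self.committed:]
        self.committed += len(batch)  # commit progress
        return batch

log = PartitionLog()
for e in ["signup", "click", "purchase"]:
    log.append(e)

c = Consumer(log)
first = c.poll()   # reads offsets 0-2, commits offset 3
log.append("logout")
later = c.poll()   # resumes at offset 3, sees only the new event
```

In real Kafka the committed offset lives in the broker-side `__consumer_offsets` topic, not in the consumer process, which is what makes resuming after a crash possible.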
Retention
How Long Events Live
Events are retained by time (default 7 days) or size. Retention is independent of consumption — events aren't deleted when read, only when they expire.
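Time-based retention can be modeled as a filter over event timestamps — a sketch that makes the key point concrete: whether an event was consumed never enters into the decision.

```python
import time

def expire(events, retention_seconds, now=None):
    """Drop events older than the retention window.
    Only the event timestamp matters; consumption is irrelevant,
    mirroring Kafka's time-based retention."""
    now = time.time() if now is None else now
    return [(ts, e) for ts, e in events if now - ts <= retention_seconds]

events = [(100.0, "old"), (500.0, "recent")]
kept = expire(events, retention_seconds=300, now=600.0)
# "old" is 500s old and expires; "recent" is 100s old and survives,
# regardless of whether any consumer ever read either one.
```

(Real Kafka deletes whole log segments once their newest record exceeds the retention window, so expiry is coarser-grained than per event.)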
Retention Configuration Options
| Config | Default | When to change |
|---|---|---|
| log.retention.hours | 168 (7 days) | Longer for event sourcing; shorter for ephemeral events |
| log.retention.bytes | -1 (unlimited) | Set a cap to control disk usage on high-volume topics |
| log.retention.ms | Derived from hours | Override with exact milliseconds for precise control |
| log.cleanup.policy | delete | Set to "compact" for changelog/key-value topics (CDC) |
| retention.ms per topic | Inherits broker | Override per topic: --config retention.ms=3600000 |
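The per-topic override from the last row looks like this with the standard Kafka CLI tools (script names assume the Apache Kafka distribution; some packages drop the `.sh` suffix, and the broker address is a placeholder):

```shell
# Create a topic with 1-hour retention
kafka-topics.sh --bootstrap-server localhost:9092 \
  --create --topic user-events --partitions 3 \
  --config retention.ms=3600000

# Change retention on an existing topic
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name user-events \
  --add-config retention.ms=3600000
```

Per-topic `retention.ms` takes precedence over the broker-wide `log.retention.*` settings, so you can keep a short broker default and lengthen it only for topics that need replay.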
Common Mistakes
Assuming ordering across partitions
Kafka only guarantees ordering within a single partition. If you need ordering for a related subset of events (e.g. all events for one user), use a partition key that maps that subset to a single partition. True global ordering across all events requires a single-partition topic, which sacrifices parallelism.
Setting retention too short for replay use cases
If you need to backfill a new consumer or reprocess after a bug, 7-day retention may not be enough. Set retention to 30–90 days for event sourcing topics.
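A sketch of why retention bounds replay: a new consumer group has no committed offset, so (with `auto.offset.reset=earliest`) it starts from the oldest *retained* offset — anything retention already deleted is simply gone.

```python
def replay(retained_log, from_offset=0):
    """A new consumer group reads the whole retained log from the
    beginning -- but only what retention has kept. Expired events
    can never be backfilled from Kafka."""
    return retained_log[from_offset:]

# With 7-day retention, a replay started on day 90 only sees the
# last week of events; the rest were deleted before the new
# consumer existed.
retained = ["day-84", "day-87", "day-89"]
backfill = replay(retained)
```

If you anticipate backfills, size retention to the longest window you might need to reprocess, or archive the topic to external storage.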
Using random partition keys
If your partition key is random (or None), events for the same entity (user, order) land on different partitions. For ordered processing per entity, use a consistent partition key like user_id.
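The difference shows up directly in the key-to-partition mapping. In this sketch (CRC32 standing in for Kafka's murmur2-based default partitioner), random keys scatter one user's events while a consistent key pins them to a single partition:

```python
import random
import zlib

def partition_for(key: str, num_partitions: int = 3) -> int:
    # Stand-in for Kafka's murmur2-based default partitioner
    return zlib.crc32(key.encode("utf-8")) % num_partitions

user_events = [f"user-7:event-{i}" for i in range(5)]

# Random keys: user-7's events typically spread across partitions,
# so downstream consumers may see them out of order.
random_parts = {partition_for(str(random.random())) for _ in user_events}

# Consistent key: all of user-7's events land on one partition,
# so their relative order is preserved.
keyed_parts = {partition_for("user-7") for _ in user_events}
```

Note the trade-off: a hot key (one extremely active user) concentrates load on one partition, so keys should be both consistent and reasonably well distributed.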
Confusing topic retention with consumer lag
Retention controls when Kafka deletes events from the log. Consumer lag is how far behind a consumer group is. These are independent — a consumer can lag behind even with infinite retention.
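The independence is visible in how lag is computed — retention never appears in the formula. A minimal sketch (partition numbers and offsets are made up for illustration):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Lag per partition = log-end offset minus committed offset.
    Retention settings do not appear anywhere in this calculation:
    a group can fall behind even with infinite retention."""
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

lag = consumer_lag(
    log_end_offsets={0: 120, 1: 95},   # newest offset per partition
    committed_offsets={0: 100, 1: 95}, # group's committed progress
)
# Partition 0 is 20 events behind; partition 1 is fully caught up.
```

Where retention and lag do interact: if a consumer lags by more than the retention window, the events it never read get deleted, and the data is lost to that group.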
FAQ
- What is a Kafka topic?
- A Kafka topic is a named category where producers publish events. Each topic is split into partitions — ordered, immutable logs distributed across brokers. Topics retain events for a configurable period regardless of consumption.
- What is a Kafka partition?
- A partition is the fundamental unit of parallelism. Each partition is an ordered log on a single broker. A topic with N partitions supports up to N parallel consumers in one consumer group.
- What is a Kafka offset?
- An offset is a sequential integer identifying each event within a partition. Consumer groups store their current offset to resume from where they left off after failures.
- How long does Kafka retain messages?
- Default retention is 7 days. Configure per topic by time (retention.ms), size (retention.bytes), or set -1 for infinite retention. Messages are not deleted when consumed — only when they expire.