Kafka Topics Explained: What They Are and How They Work

A Kafka topic is a named, ordered log of events split into partitions. Each event in a partition gets a sequential offset. Consumer groups track their offset per partition to resume from where they left off — enabling fault-tolerant, parallel, replayable event consumption.

Topic → Partition → Offset

Topic: user-events (3 partitions)

Partition 0: [offset 0] [offset 1] [offset 2] [offset 3] ...
Partition 1: [offset 0] [offset 1] [offset 2] ...
Partition 2: [offset 0] [offset 1] [offset 2] [offset 3] [offset 4] ...

Consumer Group "analytics":
  Consumer A reads Partition 0  (at offset 3)
  Consumer B reads Partition 1  (at offset 2)
  Consumer C reads Partition 2  (at offset 4)
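The structure above can be sketched as a small in-memory model (plain Python, no Kafka client; the `Topic` class and `poll` helper are illustrative, not a real API):

```python
# Minimal in-memory sketch of a topic: a list of append-only partition logs.
# Illustrative only -- a real Kafka topic lives on brokers, not in one process.

class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]  # each is an ordered log

    def append(self, partition, event):
        """Append an event; its offset is its position in the partition's log."""
        log = self.partitions[partition]
        offset = len(log)
        log.append(event)
        return offset

topic = Topic("user-events", 3)
topic.append(0, "signup")   # offset 0 in partition 0
topic.append(0, "login")    # offset 1 in partition 0
topic.append(1, "click")    # offset 0 in partition 1

# A consumer group tracks one offset per partition.
group_offsets = {"analytics": {0: 0, 1: 0, 2: 0}}

def poll(group, partition):
    """Read the next event for this group from a partition and advance its offset."""
    offset = group_offsets[group][partition]
    log = topic.partitions[partition]
    if offset >= len(log):
        return None  # caught up: nothing new to read
    event = log[offset]
    group_offsets[group][partition] = offset + 1
    return event

print(poll("analytics", 0))  # "signup"
print(poll("analytics", 0))  # "login"
```

Note that reading never removes anything from the log — it only moves the group's offset forward, which is why several groups can consume the same topic independently.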

Core Concepts

Partition

Unit of Parallelism

An ordered, immutable, append-only log whose leader lives on a single broker. A topic with N partitions supports up to N parallel consumers in one group.

Offset

Event Position

Sequential integer identifying each event within a partition. Consumer groups commit their current offset to track progress and resume after failures.
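Commit-and-resume can be sketched like this (illustrative Python, not a Kafka client API; by convention the committed value is the next offset to read):

```python
# Sketch of offset commits: the consumer periodically records how far it has read,
# so a replacement consumer can resume from the last committed position.
log = ["e0", "e1", "e2", "e3", "e4"]     # one partition's events, offsets 0..4

committed = 0                            # next offset to read, as stored by the group

def consume_batch(start, batch_size):
    """Process a batch and return the offset to commit (one past the last processed)."""
    batch = log[start:start + batch_size]
    for event in batch:
        pass  # process event (no-op in this sketch)
    return start + len(batch)

committed = consume_batch(committed, 2)  # processed offsets 0-1, commit 2
# ... consumer crashes here; a new consumer starts from the committed offset ...
committed = consume_batch(committed, 3)  # resumes at offset 2, processes 2-4
print(committed)  # 5
```

Committing after processing (as above) gives at-least-once delivery: if the consumer dies between processing and committing, the replacement re-reads that batch.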

Retention

How Long Events Live

Events are retained by time (default 7 days) or size. Retention is independent of consumption — events aren't deleted when read, only when they expire.
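The age-based half of that rule can be sketched as a predicate (illustrative; real brokers delete whole log segments once they expire, not individual events):

```python
import time

RETENTION_MS = 7 * 24 * 60 * 60 * 1000   # 7 days, matching the default log.retention.hours=168

def expired(event_timestamp_ms, now_ms, retention_ms=RETENTION_MS):
    """An event is eligible for deletion once its age exceeds the retention window --
    whether or not any consumer has read it."""
    return now_ms - event_timestamp_ms > retention_ms

now = int(time.time() * 1000)
assert expired(now - 8 * 24 * 60 * 60 * 1000, now)      # 8 days old: eligible for deletion
assert not expired(now - 1 * 24 * 60 * 60 * 1000, now)  # 1 day old: retained
```

Note that consumption never appears in the predicate — an unread event expires on schedule, and a read event survives until it does.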

Retention Configuration Options

| Config | Default | When to change |
| --- | --- | --- |
| log.retention.hours | 168 (7 days) | Longer for event sourcing; shorter for ephemeral events |
| log.retention.bytes | -1 (unlimited) | Set a cap to control disk usage on high-volume topics |
| log.retention.ms | Derived from hours | Override with exact milliseconds for precise control |
| log.cleanup.policy | delete | Set to "compact" for changelog/key-value topics (CDC) |
| retention.ms (per topic) | Inherits broker setting | Override per topic: --config retention.ms=3600000 |

Common Mistakes

Assuming ordering across partitions

Kafka only guarantees ordering within a single partition. If you need per-entity ordering (e.g. all events for one user in order), use a partition key that maps that entity to one partition. True global ordering across a topic requires a single partition, which sacrifices parallelism.

Setting retention too short for replay use cases

If you need to backfill a new consumer or reprocess after a bug, 7-day retention may not be enough. Set retention to 30–90 days for event sourcing topics.

Using random partition keys

If your partition key is random (or None), events for the same entity (user, order) land on different partitions. For ordered processing per entity, use a consistent partition key like user_id.
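A consistent key-to-partition mapping can be sketched with a stable hash (Kafka's default Java partitioner uses murmur2; md5 is used here only to keep the sketch deterministic and dependency-free):

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a key to a partition deterministically: same key, same partition."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All events keyed by one user_id land on one partition, so they stay ordered.
p1 = partition_for("user-42", 3)
p2 = partition_for("user-42", 3)
assert p1 == p2

# With a random or absent key, the same user's events scatter across
# partitions, and per-user ordering is lost.
```

The trade-off: a hot key (one very active user) concentrates load on one partition, so pick a key that is both meaningful for ordering and reasonably well distributed.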

Confusing topic retention with consumer lag

Retention controls when Kafka deletes events from the log. Consumer lag is how far behind a consumer group is. These are independent — a consumer can lag behind even with infinite retention.
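Lag is just the distance between the log end and the committed offset, per partition (a sketch of the arithmetic; tools like kafka-consumer-groups.sh report this for real clusters):

```python
def consumer_lag(log_end_offset, committed_offset):
    """Lag = events written but not yet consumed by this group, in one partition."""
    return log_end_offset - committed_offset

# The partition holds 1000 events (next offset to be written is 1000);
# the group has committed offset 850, so it is 150 events behind.
print(consumer_lag(1000, 850))  # 150

# Lag is independent of retention: with infinite retention nothing is ever
# deleted, yet a slow consumer still falls further behind. Conversely, if lag
# grows past the retention window, unread events expire before being consumed.
```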

FAQ

What is a Kafka topic?
A Kafka topic is a named category where producers publish events. Each topic is split into partitions — ordered, immutable logs distributed across brokers. Topics retain events for a configurable period regardless of consumption.
What is a Kafka partition?
A partition is the fundamental unit of parallelism. Each partition is an ordered log on a single broker. A topic with N partitions supports up to N parallel consumers in one consumer group.
What is a Kafka offset?
An offset is a sequential integer identifying each event within a partition. Consumer groups store their current offset to resume from where they left off after failures.
How long does Kafka retain messages?
Default retention is 7 days. Configure per topic by time (retention.ms), size (retention.bytes), or set -1 for infinite retention. Messages are not deleted when consumed — only when they expire.
