How to Create a Kafka Topic and Produce Your First Event

Start a Kafka broker with Docker, create a topic with kafka-topics.sh, produce events with a Python producer, and consume them with a consumer group — all running locally in under 10 minutes.

1

Start Kafka with Docker

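# docker-compose.yml
services:
  kafka:
    image: bitnami/kafka:3.6
    container_name: kafka  # lets the later `docker exec kafka ...` commands find the container
    environment:
      KAFKA_CFG_NODE_ID: 0
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 0@kafka:9093
    ports:
      - '9092:9092'

# Start the broker in the background
docker-compose up -d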

This uses KRaft mode (no ZooKeeper required). Kafka will be available at localhost:9092.

2

Create a topic

# Create a topic with 3 partitions
docker exec kafka kafka-topics.sh \
  --create \
  --topic user-events \
  --partitions 3 \
  --replication-factor 1 \
  --bootstrap-server localhost:9092

# List all topics
docker exec kafka kafka-topics.sh \
  --list --bootstrap-server localhost:9092

Use --replication-factor 1 for local development. In production, use at least 3 for fault tolerance.

3

Produce events with Python

pip install kafka-python

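from kafka import KafkaProducer
import json
import time

producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    # Serialize each dict to UTF-8 encoded JSON bytes
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)

# send() is asynchronous: it queues the record and returns immediately
for i in range(10):
    producer.send('user-events', {
        'user_id': i, 'action': 'view', 'ts': time.time()
    })

producer.flush()  # block until all queued messages are sent
producer.close()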

Always call producer.flush() before exiting. send() only queues messages in a client-side buffer, so a process that exits early can lose whatever has not been transmitted yet.
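To confirm delivery rather than only flush, one option is to keep the futures that send() returns and wait on each of them. The following is a minimal sketch, assuming kafka-python's KafkaProducer; send_and_confirm is a hypothetical helper name, not part of the library.

```python
def send_and_confirm(producer, topic, events, timeout=10.0):
    """Send every event, then block until each one is acknowledged.

    `producer` is assumed to behave like kafka-python's KafkaProducer:
    send() returns a future whose get() yields record metadata
    (topic, partition, offset) or raises if delivery failed.
    """
    futures = [producer.send(topic, event) for event in events]
    producer.flush()  # push anything still sitting in the client buffer
    # get() re-raises the delivery error, so failures are not silent
    return [future.get(timeout=timeout) for future in futures]
```

With the producer from this step, send_and_confirm(producer, 'user-events', [{'user_id': 1, 'action': 'view'}]) would return one metadata object per event, each exposing the partition and offset the broker assigned.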

4

Consume events with a consumer group

from kafka import KafkaConsumer
import json

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='analytics-pipeline',  # consumer group
    auto_offset_reset='earliest',  # read from beginning
    value_deserializer=lambda v: json.loads(v)
)

for msg in consumer:
    print(f'partition={msg.partition} offset={msg.offset} value={msg.value}')

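The group_id identifies the consumer group. Each group tracks its own committed offsets, independently of other groups. auto_offset_reset only applies when the group has no committed offset for a partition: a brand-new group with auto_offset_reset='earliest' starts from the beginning, while a restarted group resumes from its last commit.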
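The interplay between committed offsets and auto_offset_reset can be modelled in a few lines. This is a simplified sketch, not Kafka's actual implementation: starting_offset is a hypothetical function, and it ignores edge cases such as committed offsets that have fallen out of the retained log.

```python
def starting_offset(committed, earliest, latest, auto_offset_reset="latest"):
    """Return the offset a consumer starts reading from on one partition.

    committed: the group's committed offset, or None if the group has
               never committed for this partition.
    earliest:  first offset still available in the partition.
    latest:    offset the next produced message will receive.
    """
    if committed is not None:
        return committed  # an existing commit always wins
    if auto_offset_reset == "earliest":
        return earliest
    if auto_offset_reset == "latest":
        return latest
    raise ValueError(f"unsupported auto_offset_reset: {auto_offset_reset!r}")


print(starting_offset(None, 0, 10, "earliest"))  # new group: 0
print(starting_offset(7, 0, 10, "earliest"))     # restarted group: 7
```

The second call shows why changing auto_offset_reset on an existing group appears to do nothing: the commit at offset 7 takes precedence.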

5

Check consumer lag

# Check consumer group lag
docker exec kafka kafka-consumer-groups.sh \
  --describe \
  --group analytics-pipeline \
  --bootstrap-server localhost:9092

# Output has one row per partition; the relevant columns are shown below (illustrative row):
#        PARTITION | CURRENT-OFFSET | LOG-END-OFFSET | LAG
#            0      |       10       |       10       |  0  <- no lag

Consumer lag = LOG-END-OFFSET - CURRENT-OFFSET, computed per partition. A small, fluctuating lag is normal while messages are in flight; lag that keeps growing means the consumer is falling behind. Alert when lag exceeds your SLA threshold.
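The lag formula is easy to apply programmatically once the two offsets are known. The sketch below works on plain dictionaries keyed by partition number; partition_lag and lag_alert are hypothetical helpers, and the threshold is whatever your SLA dictates.

```python
def partition_lag(end_offsets, committed_offsets):
    """Compute lag per partition as LOG-END-OFFSET minus CURRENT-OFFSET.

    Both arguments map partition number -> offset. A partition with no
    committed offset (missing or None) is treated as committed at 0,
    an assumption that only holds while the log still starts at offset 0.
    """
    return {
        partition: max(end - (committed_offsets.get(partition) or 0), 0)
        for partition, end in end_offsets.items()
    }


def lag_alert(end_offsets, committed_offsets, threshold):
    """Return (total_lag, breached) for a given alert threshold."""
    total = sum(partition_lag(end_offsets, committed_offsets).values())
    return total, total > threshold


print(partition_lag({0: 4, 1: 3, 2: 3}, {0: 4, 1: 1}))  # {0: 0, 1: 2, 2: 3}
```

In kafka-python, consumer.end_offsets() and consumer.committed() can supply these values, keyed by TopicPartition objects rather than plain integers.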

When to Create Topics Programmatically

  • Use CLI (kafka-topics.sh) for one-off setup and local development
  • Use AdminClient API in Python/Java for infrastructure-as-code and CI/CD pipelines
  • Use Terraform for managed Kafka: the Confluent provider covers Confluent Cloud, while Amazon MSK needs a different provider because the Confluent provider does not manage MSK
  • Always set explicit retention and partition count — don't rely on defaults in production
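As a rough illustration of the AdminClient route, the sketch below uses kafka-python's KafkaAdminClient and NewTopic to create a topic with explicit partition count and retention. TopicSpec and ensure_topic are hypothetical names, and the default values (3 partitions, 7-day retention) are placeholders rather than recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicSpec:
    """Explicit topic settings; the defaults here are illustrative only."""
    name: str
    partitions: int = 3
    replication_factor: int = 1
    retention_ms: int = 7 * 24 * 60 * 60 * 1000  # 7 days

    def configs(self):
        # Topic-level configs are passed to the broker as strings
        return {"retention.ms": str(self.retention_ms)}


def ensure_topic(spec, bootstrap_servers="localhost:9092"):
    """Create the topic if missing; return True if created, False if it existed."""
    # Imported here so TopicSpec stays usable without kafka-python installed
    from kafka.admin import KafkaAdminClient, NewTopic
    from kafka.errors import TopicAlreadyExistsError

    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    try:
        admin.create_topics([
            NewTopic(
                name=spec.name,
                num_partitions=spec.partitions,
                replication_factor=spec.replication_factor,
                topic_configs=spec.configs(),
            )
        ])
        return True
    except TopicAlreadyExistsError:
        return False
    finally:
        admin.close()
```

Calling ensure_topic(TopicSpec("user-events")) against the local broker would be idempotent, which is the property a CI/CD pipeline usually needs. Note that it does not reconcile settings on a topic that already exists.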

Common Issues

Connection refused on localhost:9092

Kafka is not yet ready. Wait 10–15 seconds after docker-compose up. Check logs with: docker logs kafka
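Instead of sleeping for a fixed time, a script can poll the port until it accepts connections. This is a minimal standard-library sketch; wait_for_port is a hypothetical helper, and an open port is only a necessary sign of readiness, since Docker's port mapping may accept connections before the broker has finished starting.

```python
import socket
import time


def wait_for_port(host="localhost", port=9092, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection as a liveness probe
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # refused or unreachable: retry shortly
    return False
```

A setup script could call wait_for_port() before creating topics and fall back to docker logs kafka if it returns False.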

Messages not appearing in consumer

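Check auto_offset_reset. If it is set to "latest" (the kafka-python default), a group with no committed offsets only sees messages produced after it starts. Set it to "earliest" to read from the beginning. A group that has already committed offsets resumes from them regardless of this setting.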

Consumer group stuck / lag not decreasing

Your consumer may be crashing silently. Add exception handling around message processing. Check for deserialization errors with malformed messages.
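One way to keep a single malformed message from killing the consumer is to make deserialization and processing failure-tolerant. The sketch below assumes JSON values as in the earlier steps; safe_json_deserializer and process_all are hypothetical helpers.

```python
import json


def safe_json_deserializer(raw):
    """Decode a message value as JSON; return None instead of raising on bad input."""
    if raw is None:
        return None
    try:
        return json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return None


def process_all(values, handler):
    """Apply handler to each non-None value; return the (value, error) pairs that failed."""
    failed = []
    for value in values:
        if value is None:
            continue  # malformed payload, already filtered by the deserializer
        try:
            handler(value)
        except Exception as exc:  # keep consuming; surface the error for logging
            failed.append((value, exc))
    return failed
```

Passing value_deserializer=safe_json_deserializer to KafkaConsumer and feeding msg.value through a handler wrapped this way keeps the loop alive. Log or dead-letter the failures rather than discarding them.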

Too many partitions slowing things down

Each partition adds overhead (file handles, replication traffic). For local dev, start with 1–3 partitions. Only increase when you need consumer parallelism.

FAQ

How do I create a Kafka topic?

Use the kafka-topics.sh CLI: kafka-topics.sh --create --topic my-topic --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092. You can also create topics via the AdminClient API, or let the broker create them on first publish when auto.create.topics.enable is true (the broker default).

How many partitions should a Kafka topic have?

Start with at least as many partitions as the consumer instances you plan to run in one group, since each partition is read by only one consumer in a group. You can add partitions later but never remove them, and adding partitions changes which partition a keyed message maps to, so leave some headroom up front. For high-throughput topics, 12–100 is common.

What is a Kafka consumer group?

A set of consumers that together read all partitions of a topic. Each partition is assigned to exactly one consumer in the group, so consumers beyond the partition count sit idle. Multiple groups can read the same topic independently.
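The one-owner-per-partition rule can be illustrated with a toy assignment function. This is a simplified round-robin sketch, not one of Kafka's real assignors (which differ in detail); assign_partitions and the consumer names are hypothetical.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin; each partition gets one owner."""
    if not consumers:
        return {}
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


print(assign_partitions(3, ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1]}
print(assign_partitions(3, ["c1", "c2", "c3", "c4"]))
# {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```

The second call shows a fourth consumer receiving nothing from a 3-partition topic, which is why the partition count caps parallelism within a group.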
