How to Create a Kafka Topic and Produce Your First Event
Start a Kafka broker with Docker, create a topic with kafka-topics.sh, produce events with a Python producer, and consume them with a consumer group — all running locally in under 10 minutes.
Start Kafka with Docker
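# docker-compose.yml
services:
  kafka:
    image: bitnami/kafka:3.6
    container_name: kafka  # fixed name so `docker exec kafka ...` finds the container
    environment:
      KAFKA_CFG_NODE_ID: 0
      KAFKA_CFG_PROCESS_ROLES: controller,broker
      KAFKA_CFG_CONTROLLER_QUORUM_VOTERS: 0@kafka:9093
      KAFKA_CFG_CONTROLLER_LISTENER_NAMES: CONTROLLER
      KAFKA_CFG_LISTENERS: PLAINTEXT://:9092,CONTROLLER://:9093
      KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      KAFKA_CFG_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
    ports:
      - '9092:9092'

docker-compose up -d

This uses KRaft mode (no ZooKeeper required). Kafka will be available at localhost:9092.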
Create a topic
# Create a topic with 3 partitions
docker exec kafka kafka-topics.sh \
  --create \
  --topic user-events \
  --partitions 3 \
  --replication-factor 1 \
  --bootstrap-server localhost:9092

# List all topics
docker exec kafka kafka-topics.sh \
  --list --bootstrap-server localhost:9092

Use --replication-factor 1 for local development: the replication factor cannot exceed the number of brokers, and this setup runs a single broker. In production, use at least 3 for fault tolerance.
Produce events with Python
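pip install kafka-python

import json
import time

from kafka import KafkaProducer

# Serialize each event dict to UTF-8 encoded JSON bytes
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode()
)

# send() is asynchronous: it buffers the message and returns immediately
for i in range(10):
    producer.send('user-events', {
        'user_id': i, 'action': 'view', 'ts': time.time()
    })

producer.flush()  # block until all buffered messages are sent

Always call producer.flush() before exiting. send() only places messages in an in-memory buffer, so a process that exits without flushing may lose them.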
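To confirm that each event actually reached the broker, one option is to keep the future that send() returns and wait on it. The sketch below assumes a producer that behaves like kafka-python's KafkaProducer, where send() returns a future and get(timeout=...) yields metadata with partition and offset attributes. The helper name send_and_confirm is hypothetical and not part of the library.

```python
def send_and_confirm(producer, topic, events, timeout=10):
    """Send each event and wait for the broker's acknowledgement.

    `producer` is expected to behave like kafka.KafkaProducer: send() returns
    a future, and future.get(timeout=...) returns metadata with `partition`
    and `offset` attributes (or raises if delivery failed).
    """
    # Queue everything first so the producer can still batch messages.
    futures = [producer.send(topic, event) for event in events]
    producer.flush()  # push any buffered messages out before waiting

    # Collect (partition, offset) for every acknowledged message.
    return [
        (meta.partition, meta.offset)
        for meta in (future.get(timeout=timeout) for future in futures)
    ]
```

Called as send_and_confirm(producer, 'user-events', [{'user_id': 1, 'action': 'view'}]), this would return one (partition, offset) pair per event, or raise if a send failed or timed out.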
Consume events with a consumer group
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['localhost:9092'],
    group_id='analytics-pipeline',   # consumer group
    auto_offset_reset='earliest',    # start from the beginning if the group has no committed offset
    value_deserializer=lambda v: json.loads(v)
)

# Iterates indefinitely, blocking while it waits for new messages; stop with Ctrl+C
for msg in consumer:
    print(f'partition={msg.partition} offset={msg.offset} value={msg.value}')

The group_id identifies the consumer group. Each group tracks its own offsets, so a group with no committed offsets starts from the earliest retained message when auto_offset_reset='earliest'. Once the group has committed offsets, it resumes from them instead.
Check consumer lag
# Check consumer group lag
docker exec kafka kafka-consumer-groups.sh \
  --describe \
  --group analytics-pipeline \
  --bootstrap-server localhost:9092

# Output (abridged): PARTITION | CURRENT-OFFSET | LOG-END-OFFSET | LAG
#                    0         | 10             | 10             | 0    <- no lag

Consumer lag = LOG-END-OFFSET - CURRENT-OFFSET, computed per partition. Non-zero lag means the consumer has unread messages; lag that keeps growing means it is falling behind. Alert when lag exceeds your SLA threshold.
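The lag formula above can be sketched in a few lines of Python. This is a minimal illustration that works on plain dictionaries mapping partition number to offset; in kafka-python, such values could be gathered with KafkaConsumer.end_offsets() and KafkaConsumer.committed(). The function names, and the choice to treat a partition with no committed offset as fully unread, are assumptions made for this example.

```python
def compute_lag(end_offsets, committed_offsets):
    """Return per-partition lag as LOG-END-OFFSET minus CURRENT-OFFSET.

    Both arguments map a partition number to an offset. A partition with no
    committed offset yet (missing or None) is treated as fully unread.
    """
    lag = {}
    for partition, end in end_offsets.items():
        current = committed_offsets.get(partition)
        if current is None:
            current = 0  # assumption: nothing consumed yet on this partition
        lag[partition] = max(end - current, 0)
    return lag


def partitions_over_threshold(lag, threshold):
    """List the partitions whose lag exceeds the alerting threshold."""
    return sorted(p for p, value in lag.items() if value > threshold)
```

For example, compute_lag({0: 10, 1: 7, 2: 5}, {0: 10, 1: 4}) gives {0: 0, 1: 3, 2: 5}, and with a threshold of 2 the partitions to alert on are [1, 2].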
When to Create Topics Programmatically
- Use the CLI (kafka-topics.sh) for one-off setup and local development
- Use the AdminClient API in Python/Java for infrastructure-as-code and CI/CD pipelines
- Use Terraform for managed Kafka: the Confluent provider covers Confluent Cloud, while Amazon MSK clusters are managed through the AWS provider
- Always set explicit retention and partition count; don't rely on defaults in production
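As a rough sketch of the AdminClient route in Python, the example below uses kafka-python's KafkaAdminClient and NewTopic to create a topic with an explicit partition count and retention. The helper names build_topic_spec and create_topic are hypothetical, and the sketch assumes create_topics raises TopicAlreadyExistsError when the topic already exists, which may vary between library versions.

```python
def build_topic_spec(name, partitions, replication_factor, retention_ms):
    """Describe a topic with explicit partition count and retention."""
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication_factor must be >= 1")
    return {
        "name": name,
        "num_partitions": partitions,
        "replication_factor": replication_factor,
        # retention.ms is a topic-level config; config values are strings
        "topic_configs": {"retention.ms": str(retention_ms)},
    }


def create_topic(spec, bootstrap_servers="localhost:9092"):
    """Create the topic described by `spec`; return False if it already exists."""
    # Imported here so build_topic_spec stays usable without kafka-python.
    from kafka.admin import KafkaAdminClient, NewTopic
    from kafka.errors import TopicAlreadyExistsError

    admin = KafkaAdminClient(bootstrap_servers=bootstrap_servers)
    try:
        admin.create_topics([NewTopic(**spec)])
        return True
    except TopicAlreadyExistsError:
        return False  # already present, so repeated runs are harmless
    finally:
        admin.close()
```

A CI job could call create_topic(build_topic_spec("user-events", 3, 1, 7 * 24 * 60 * 60 * 1000)) to request seven days of retention. Returning False for an existing topic keeps the step safe to re-run.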
Common Issues
Connection refused on localhost:9092
Kafka is not yet ready. Wait 10–15 seconds after docker-compose up. Check logs with: docker logs kafka
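Instead of sleeping for a fixed time, a script can poll the port until it accepts connections. The following is a minimal sketch using only the standard library; wait_for_port is a hypothetical helper, and an open port is only a rough signal, since the broker may need a moment longer before it serves requests.

```python
import socket
import time


def wait_for_port(host="localhost", port=9092, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

Calling wait_for_port() before constructing a producer or consumer avoids the initial connection error in most local runs.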
Messages not appearing in consumer
Check auto_offset_reset. If it is set to "latest" (the default in kafka-python), a consumer group with no committed offsets only sees messages produced after it starts. Set it to "earliest" to read from the beginning.
Consumer group stuck / lag not decreasing
Your consumer may be crashing silently. Add exception handling around message processing. Check for deserialization errors with malformed messages.
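One way to apply this advice is to make the deserializer tolerant of bad input and wrap the per-message handler in a try/except. The sketch below is an assumption-laden illustration: safe_json_deserializer and process_all are hypothetical names, and process_all accepts any iterable of objects with a value attribute, which is how kafka-python exposes consumed records.

```python
import json
import logging

log = logging.getLogger("consumer")


def safe_json_deserializer(raw):
    """Decode JSON bytes, returning None instead of raising on bad input."""
    try:
        return json.loads(raw)
    except (ValueError, TypeError):
        # JSONDecodeError and UnicodeDecodeError are both ValueError subclasses
        log.warning("skipping malformed message: %r", raw)
        return None


def process_all(messages, handler):
    """Run handler on each message value, logging failures instead of crashing.

    Returns (processed, failed) counts once the iterable is exhausted.
    """
    processed = failed = 0
    for msg in messages:
        if msg.value is None:  # the deserializer rejected this message
            failed += 1
            continue
        try:
            handler(msg.value)
            processed += 1
        except Exception:
            log.exception("handler failed for value=%r", msg.value)
            failed += 1
    return processed, failed
```

Passing value_deserializer=safe_json_deserializer to KafkaConsumer keeps one malformed message from stopping the loop. Note that a KafkaConsumer iterates indefinitely unless consumer_timeout_ms is set, so process_all only returns its counts when the iteration ends.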
Too many partitions slowing things down
Each partition adds overhead (file handles, replication traffic). For local dev, start with 1–3 partitions. Only increase when you need consumer parallelism.
FAQ
- How do I create a Kafka topic?
- Use the kafka-topics.sh CLI: kafka-topics.sh --create --topic my-topic --partitions 3 --replication-factor 1 --bootstrap-server localhost:9092. You can also create topics via the AdminClient API, or let the broker create them automatically on first publish if auto.create.topics.enable is true.
- How many partitions should a Kafka topic have?
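- Start with partitions = the number of consumer instances you plan to run. You can add partitions later but never remove them, and adding partitions changes which partition a given key maps to, so leave some headroom up front if you rely on per-key ordering. For high-throughput topics, 12–100 is common.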
- What is a Kafka consumer group?
- A set of consumers that together read all partitions of a topic. Each partition is assigned to exactly one consumer in the group. Multiple groups can read the same topic independently.
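The one-owner-per-partition rule can be illustrated with a toy model. The sketch below is a simplified round-robin assignment written for this example, not Kafka's actual assignor (the real client supports several strategies), and assign_partitions is a hypothetical name.

```python
def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin; each partition gets one owner."""
    assignment = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

With 3 partitions and two consumers, assign_partitions(3, ["c1", "c2"]) gives {"c1": [0, 2], "c2": [1]}. With four consumers, the fourth receives an empty list, which shows why running more consumers than partitions in one group adds no parallelism.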