Apache Kafka is one of the most popular open-source data streaming platforms, adopted by companies such as Netflix, Spotify, LinkedIn, and Uber to build mission-critical applications. In this guide, we take an in-depth look at deploying a Kafka cluster using Docker Compose for development and testing.
Overview:
- Introduction to Apache Kafka
- Deploying Kafka Cluster with Docker Compose
- Working with Kafka Cluster
- Kafka Cluster Administration
- Comparison of Deployment Options
- Kafka Security Considerations
- Conclusion
Let’s get started.
Introduction to Apache Kafka
Apache Kafka is an open-source, highly scalable, low-latency platform for storing and processing streams of data in a fault-tolerant way. Before we deploy Kafka, let's understand its architecture and core concepts.
Kafka Architecture
The following diagram shows the high-level architecture of Apache Kafka:

A Kafka cluster primarily consists of the following components:
Brokers: Kafka brokers are server nodes that receive messages from producers and store them on disk. Data is replicated across brokers for fault tolerance using a replication factor.
Topics and Partitions: Kafka topics are logical streams of data. A topic is split into ordered partitions which contain messages in an immutable sequence. Partitions allow parallelism by distributing data across brokers.
Producers: Producers are applications that publish data to Kafka brokers. The producer decides which partition each message goes to, typically by hashing the message key (or round-robin when no key is set).
Consumers: Consumers subscribe to topics and process messages published by producers. Consumers track their read progress within each partition.
ZooKeeper: ZooKeeper coordinates the cluster by electing the controller broker and managing the service registry and cluster metadata.
This architecture allows Kafka to be massively scalable and achieve very high throughput for reading and writing streams of data.
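The way keyed messages map to partitions can be sketched in a few lines of Python. This is a simplified stand-in for Kafka's default partitioner (which uses murmur2 hashing, not md5), but it shows the property that matters: the same key always lands in the same partition.

```python
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    # Kafka's real default partitioner uses murmur2; md5 here just
    # illustrates the idea: hash the key, then modulo the partition count.
    digest = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return digest % num_partitions

# Records with the same key always land in the same partition,
# which preserves per-key ordering.
assert assign_partition(b"user-42", 3) == assign_partition(b"user-42", 3)
```

Because partition assignment is deterministic per key, all events for one entity (a user, an order) stay in order relative to each other while the overall load spreads across brokers.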
Key Concepts of Kafka
Some key concepts related to Kafka streams and partitions:
- Retention: Kafka retains streams of data durably based on the retention period configured per topic. For example, a retention of 7 days lets consumers replay data up to 7 days old.
- Multiple Subscribers: A topic can have many consumer groups subscribing in parallel, with read progress tracked independently per group. For example, a payments group and a fraud-analysis group can consume the same stream independently.
- Ordering Guarantees: Consumers read records in order within a partition, avoiding out-of-order issues in stream processing.
- Horizontal Scalability: Partitions distribute a topic's data across many brokers, enabling horizontal scale.
- High Availability: Data is replicated across brokers according to the replication factor, preventing data loss when a broker fails.
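The "multiple subscribers" concept can be made concrete with a toy Python model (illustrative only, not a real Kafka client): each consumer group keeps its own offset into the same partition log, so two groups consume the same stream without affecting each other.

```python
# One partition's record log, shared by all consumer groups.
log = ["payment-1", "payment-2", "payment-3"]

# Each consumer group tracks its own committed offset independently.
offsets = {"payments-app": 0, "fraud-analysis": 0}

def poll(group: str, max_records: int) -> list:
    # Read from the group's current offset, then advance (commit) it.
    start = offsets[group]
    records = log[start:start + max_records]
    offsets[group] = start + len(records)
    return records

# The two groups read the same records at their own pace.
assert poll("payments-app", 2) == ["payment-1", "payment-2"]
assert poll("fraud-analysis", 3) == ["payment-1", "payment-2", "payment-3"]
```

Since offsets live per group rather than per topic, adding a new downstream consumer never disturbs existing ones.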
Now that we understand Kafka's architecture and core concepts, let's deploy a Kafka cluster using Docker.
Deploying Kafka Cluster with Docker Compose
For development and testing, Kafka can be conveniently deployed on a single host using Docker containers. We will use Docker Compose to deploy a 3-broker Kafka cluster along with ZooKeeper.
Docker Compose File for Kafka Cluster
Here is the docker-compose.yml file that will start a Kafka cluster:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.3.0
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka-1:
    image: confluentinc/cp-kafka:7.3.0
    container_name: kafka-1
    ports:
      - "9091:9091"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER://kafka-1:9091
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3

  kafka-2:
    image: confluentinc/cp-kafka:7.3.0
    container_name: kafka-2
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER://kafka-2:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3

  kafka-3:
    image: confluentinc/cp-kafka:7.3.0
    container_name: kafka-3
    ports:
      - "9093:9093"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 3
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_LISTENERS: LISTENER_DOCKER://kafka-3:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: LISTENER_DOCKER:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: LISTENER_DOCKER
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
Let's analyze the docker-compose file:
- We use the official Confluent Kafka images tagged 7.3.0.
- The cluster consists of one ZooKeeper instance and three Kafka brokers.
- ZooKeeper listens on port 2181; the brokers are exposed on ports 9091, 9092, and 9093.
- The broker ids are set from 1 to 3 using the KAFKA_BROKER_ID environment variable.
- Each broker connects to ZooKeeper via the container name zookeeper at port 2181.
- KAFKA_ADVERTISED_LISTENERS defines the addresses clients use to reach each broker; the LISTENER_DOCKER listener carries inter-broker traffic inside the Docker network.
With this configuration, we are ready to start our containers.
Starting Kafka Cluster Containers
Launch the containers in detached mode using docker-compose:
docker-compose up -d
Verify that the containers are running using docker ps:
CONTAINER ID IMAGE STATUS PORTS NAMES
d8e2f611a263 confluentinc/cp-kafka:7.3.0 Up 5 seconds 0.0.0.0:9093->9093/tcp kafka-3
475e75f658d5 confluentinc/cp-kafka:7.3.0 Up 5 seconds 0.0.0.0:9092->9092/tcp kafka-2
cbe4146ef60d confluentinc/cp-kafka:7.3.0 Up 6 seconds 0.0.0.0:9091->9091/tcp kafka-1
412aaa80172b confluentinc/cp-zookeeper:7.3.0 Up 6 seconds 2181/tcp, 2888/tcp, 0.0.0.0:2181->2181/tcp zookeeper
Our 3-broker Kafka cluster is now ready!
Let's test it out by producing and consuming messages.
Working with Kafka Cluster
We can connect to the Kafka brokers to publish and consume messages. Note that the brokers advertise the container names kafka-1, kafka-2, and kafka-3, so a client on the host must be able to resolve those names (for example, via /etc/hosts entries pointing to 127.0.0.1); alternatively, run the CLI commands inside a broker container with docker exec.
Produce Messages to Kafka Topic
Let's produce a few messages to the test topic using the console producer (this assumes the Kafka CLI tools are available, either on the host or inside a broker container):
kafka-console-producer --topic test --bootstrap-server localhost:9091
Then in the console, type some messages and hit enter:
Hello Kafka!
This is my first Kafka message
Learning Kafka with Docker
This publishes the messages to the test topic on broker at port 9091.
Consume Published Messages
In another terminal, consume the published messages starting from beginning:
kafka-console-consumer --topic test --from-beginning --bootstrap-server localhost:9091
You should see the messages consumed:
Hello Kafka!
This is my first Kafka message
Learning Kafka with Docker
Likewise, you can connect producers and consumers to brokers running at ports 9092 and 9093 as well.
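Why does any one of the three brokers work as a bootstrap server? The client contacts whichever broker it is given and fetches the full cluster metadata from it. The toy Python model below sketches that idea (hypothetical names, not a real Kafka client):

```python
# All brokers share the same view of cluster metadata via the controller.
CLUSTER_BROKERS = ["localhost:9091", "localhost:9092", "localhost:9093"]

def bootstrap(server: str) -> list:
    # A real client opens a connection and sends a Metadata request;
    # here we just return the shared broker list.
    if server not in CLUSTER_BROKERS:
        raise ConnectionError("cannot reach " + server)
    return list(CLUSTER_BROKERS)

# Bootstrapping from any broker yields the same cluster view.
assert bootstrap("localhost:9092") == bootstrap("localhost:9093")
```

This is why production clients list several bootstrap servers: any one of them is enough to discover the rest, and extras are only needed if the first is down.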
Our Kafka cluster works correctly! Let's look at how to manage and monitor it.
Kafka Cluster Administration with Control Center
Confluent Control Center provides a GUI to manage and monitor Kafka clusters, including broker monitoring, topic management, and schema registry integration.
Integrate Control Center
We update the docker-compose file to add Control Center:
services:
  # Existing zookeeper and kafka services

  control-center:
    image: confluentinc/cp-enterprise-control-center:7.3.0
    hostname: control-center
    container_name: control-center
    depends_on:
      - kafka-1
      - kafka-2
      - kafka-3
    ports:
      - "9021:9021"
    environment:
      CONTROL_CENTER_BOOTSTRAP_SERVERS: 'kafka-1:9091,kafka-2:9092,kafka-3:9093'
      CONTROL_CENTER_REPLICATION_FACTOR: 3
We connected Control Center to the 3 Kafka brokers.
Access Control Center GUI
Restart the cluster with docker-compose up -d and open Control Center at http://localhost:9021 in your browser.
You will see the Overview dashboard with live metrics for brokers, topics, partitions and consumers:

Control Center also provides topic management, schema registry, and other administration features worth exploring.
Next, let's compare this single-host Docker Compose deployment with other setups.
Comparison of Kafka Deployment Options
For development/testing purposes, Kafka can be conveniently deployed on a single host using Docker Compose. This keeps resource usage minimal.
However, for production usage there are other recommended deployment options:
| Deployment Method | Description | Benefits | Use Case |
|---|---|---|---|
| Docker Compose | Multiple containers on a single Docker host | Simplified configuration, local development | Prototype, development |
| Docker Swarm | Distributed containers across multiple Docker hosts | High-availability, horizontal scale | Small scale production |
| Kubernetes | Containers managed by Kubernetes cluster manager | Flexibility, reliability, self-healing | Enterprise grade production |
| Confluent Cloud | Fully managed Apache Kafka clusters | No ops, cloud integration capabilities | Public cloud based data streaming |
Table 1: Comparison of popular Kafka deployment options
As shown above, Docker Compose is great for local development but inadequate for large-scale production scenarios.
Managed offerings like Confluent Cloud provide fully managed, auto-scaled Kafka clusters on public clouds along with capabilities like:
- Automatic provisioning, scaling and healing
- Out-of-the-box monitoring, security, and role-based access controls
- Over 100 cloud service integrations (Google Cloud, AWS, Azure)
- Developer self-service access to clusters and topics
For enterprise-grade production use cases, managed services eliminate operational complexity.
Now let's discuss some Kafka security considerations.
Kafka Security Considerations
Though we haven't configured security here for simplicity, here are some guidelines for properly securing Kafka clusters:
- Use SSL for encryption between Kafka clients and brokers
- Enable SASL/SCRAM authentication for client connections
- Restrict access with ACLs between clients and allowed topics
- Encrypt inter-broker communication using SSL or IPSec
- Integrate with enterprise authentication systems
- Enable schema validation on producers and consumers
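A few of these measures could look like the following broker-side server.properties fragment. This is a hypothetical sketch with placeholder paths and values, not a working configuration:

```properties
# SSL encryption for client connections, SASL/SCRAM for authentication
listeners=SASL_SSL://0.0.0.0:9092
ssl.keystore.location=/etc/kafka/secrets/kafka.keystore.jks
ssl.truststore.location=/etc/kafka/secrets/kafka.truststore.jks
sasl.enabled.mechanisms=SCRAM-SHA-512

# Encrypt inter-broker traffic and enforce ACLs for topic access
security.inter.broker.protocol=SSL
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
allow.everyone.if.no.acl.found=false
```

In practice you would also need keystore passwords, SCRAM credentials created via kafka-configs, and per-principal ACLs before clients can connect.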
Additional measures such as encryption at rest, dedicated VPC networks, and firewall policies should also be considered.
For multi-datacenter clusters, secure the cross-datacenter replication links as well, building on the cloud provider's network security infrastructure.
Finally, let's recap what we learned.
Conclusion
In this comprehensive guide, we covered the following:
- Overview of Kafka architecture and core concepts
- Step-by-step instructions to deploy a 3 node Kafka cluster using Docker Compose
- Examples to produce and consume messages with Kafka brokers
- Integrating Confluent Control Center for Kafka cluster monitoring
- Comparison between Kafka deployment methods
- Security considerations for Kafka in production
Apache Kafka provides a high performance, resilient platform for building streaming data pipelines and applications. For local development purposes, Docker Compose provides a simplified way to run Kafka clusters.
Additionally, cloud-native deployment options make running large scale production Kafka clusters convenient by eliminating operational complexity.
I hope you found this guide useful. Feel free to reach out with any questions!


