Kafka topic | created on which node

Contributor

Hi Experts,

We have a 3-node Kafka broker cluster set up along with a 3-node ZooKeeper ensemble. I have some questions about creating topics in the cluster.

I am running the following command to create a topic:

./kafka-topics.sh --create --zookeeper zookeeper:2181 --replication-factor 1 --partitions 1 --topic portfolio_break_stat

Here are my questions:

1. If I set replication-factor to 1, on which node will the topic be created? Will it be created on all nodes or only on one? If only on one, how does Kafka decide which?

2. If I do not specify replication-factor, how is the topic created, and which node will it pick?

1 REPLY

Super Collaborator

First off, ideally, to prevent data loss, you should use more than one replica. For better throughput, use more than one partition.
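
For example, on a 3-broker cluster like the one in the question, the create command could be wrapped roughly as follows. This is only a sketch: the function name `create_replicated_topic` and the `KAFKA_TOPICS` variable are made up for illustration, and the values 3 and 3 are assumptions chosen to match the three brokers, not a recommendation for every workload.

```shell
# Hypothetical wrapper around the create command from the question.
# KAFKA_TOPICS is an illustrative variable naming the script to run;
# it defaults to ./kafka-topics.sh as used in the original post.
create_replicated_topic() {
  topic="$1"
  # Replication factor 3 keeps a copy of each partition on every
  # broker of a 3-node cluster; 3 partitions is an assumed value.
  "${KAFKA_TOPICS:-./kafka-topics.sh}" --create \
    --zookeeper zookeeper:2181 \
    --replication-factor 3 \
    --partitions 3 \
    --topic "$topic"
}
```

Calling `create_replicated_topic portfolio_break_stat` would then run the same command as in the question, with only the replication factor and partition count changed.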

When you describe the topic, it shows the leader for each partition as a broker ID. You will need to note which IDs belong to which machines, as well as the data directory of each broker, to know where the data is stored on those servers.
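
A rough sketch of that lookup, assuming the same ZooKeeper address and topic name as in the question. The helper names `partition_leaders` and `broker_identity` are made up for illustration, and the path to `server.properties` depends on your installation.

```shell
# Print "partition leader-broker-id" for each partition line of
# `kafka-topics.sh --describe` output read from stdin.
# partition_leaders is a hypothetical helper, not part of Kafka.
partition_leaders() {
  awk '{
    p = ""; l = ""
    for (i = 1; i < NF; i++) {
      if ($i == "Partition:") p = $(i + 1)
      if ($i == "Leader:")    l = $(i + 1)
    }
    if (p != "" && l != "") print p, l
  }'
}

# Print the broker ID and data directory settings from a broker's
# server.properties file (path passed as the first argument).
# broker_identity is a hypothetical helper, not part of Kafka.
broker_identity() {
  grep -E '^(broker\.id|log\.dirs?)=' "$1"
}

# Usage against a live cluster (not run here):
#   ./kafka-topics.sh --describe --zookeeper zookeeper:2181 \
#       --topic portfolio_break_stat | partition_leaders
#   broker_identity /path/to/config/server.properties   # on each broker
```

Matching the leader ID from the first helper against the `broker.id` printed on each machine tells you which server holds the partition, and `log.dirs` tells you where on that server the data sits.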

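As for how it decides: with a replication factor of 1, each partition lives on exactly one broker, which Kafka picks when the topic is created. Leader election among replicas is coordinated through ZooKeeper; the Kafka documentation / wiki is worth reading if you are really curious about the details.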

Forcing leaders is also possible: http://blog.erdemagaoglu.com/post/128624804243/forcing-kafka-partition-leaders
