Why should Kafka have an odd number of machines?

We have heard that a Kafka cluster should have an odd number of machines to avoid split-brain scenarios.

Can we get more information about this? Why should the number of Kafka machines be odd?

We want to create the following Ambari cluster based on HDP version 2.6.5:

master machines - 3

kafka machines - 17

worker machines - 160

Michael-Bronson

Super Collaborator

There is no such rule for Kafka Brokers.

ZooKeeper has to maintain a quorum: a majority of floor(n/2) + 1 of its n servers must agree on leader-election values and locks. That is why ZooKeeper ensembles usually have an odd total number of servers, chosen to accommodate hardware and network failure scenarios.

According to "Kafka: The Definitive Guide", as well as the Apache ZooKeeper site, you will generally see negative side effects from running more than 5 or 7 ZooKeeper servers in total for the applications using them.

You should have more than 3 ZooKeeper servers, because if one goes down you are left with only 2, and a second failure (or taking one more down for maintenance) loses the quorum. With 5 servers, two can go down and the remaining 3 still form a majority. With 7, you can lose up to 3 ZooKeeper servers and still be fine.
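The majority arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this example and are not part of ZooKeeper or Kafka.

```python
def quorum_size(n: int) -> int:
    """Smallest strict majority of an n-server ensemble: floor(n/2) + 1."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can fail while a majority is still running."""
    return n - quorum_size(n)


for n in (3, 5, 7):
    print(f"{n} servers: quorum = {quorum_size(n)}, "
          f"can lose {tolerated_failures(n)}")
```

For 3, 5 and 7 servers this gives a quorum of 2, 3 and 4, and a tolerance of 1, 2 and 3 failed servers respectively.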

So, to summarize what you said: do you mean that we need a minimum of 5 ZooKeeper servers for 17 Kafka machines?

Or, in other words:

How many ZooKeeper servers do you suggest for the following cluster nodes?

master machines - 3

kafka machines - 17

worker machines - 160

Michael-Bronson

Super Collaborator

@Michael Bronson - The terms "master/worker" don't really mean anything in Kafka terms.

17 Kafka brokers seems like a lot (we have about that many brokers in AWS handling about 2 million messages per day), but yes, a minimum of 5 ZooKeeper servers is encouraged to account for maintenance and hardware failure, as mentioned.

@Jordan Moore

What are the risks if we still use only 3 ZooKeeper servers with 17 Kafka machines?

Michael-Bronson

Super Collaborator

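@Michael Bronson - Well, the obvious: with 3 ZooKeeper servers you can tolerate only one failure. If a second server stops responding (for example while the first is down for maintenance), the quorum is lost and Kafka leader election fails. Your consumers and producers wouldn't be able to determine which broker should serve requests for a given topic partition.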

Hardware fails for a variety of reasons, and it would be better if you converted two of the 160 available worker nodes to be dedicated Zookeeper servers.

@Michael Bronson ZooKeeper is normally deployed on an odd number of hosts so that it can form a majority quorum. A 3-node ensemble can survive the loss of 1 node. It will fail if 2 nodes are lost at the same time (for example, a node fails while another is down for an upgrade). If ZooKeeper goes down, the brokers will not operate.

The "Designing a ZooKeeper Deployment" section of the ZooKeeper documentation explains:

"For the ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines. Thus, a deployment that consists of three machines can handle one failure, and a deployment of five machines can handle two failures. Note that a deployment of six machines can only handle two failures since three machines is not a majority. For this reason, ZooKeeper deployments are usually made up of an odd number of machines."
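As a rough illustration of the 2xF+1 rule in the quote, the following Python sketch computes the ensemble size needed for a given number of failures and shows why an even-sized ensemble adds no extra tolerance. The function names are hypothetical and chosen only for this example.

```python
def min_ensemble_size(f: int) -> int:
    """Servers needed to tolerate f failed machines: 2 * f + 1."""
    return 2 * f + 1


def failures_handled(n: int) -> int:
    """Failures an n-server ensemble survives while keeping a majority."""
    return (n - 1) // 2


# Five and six servers handle the same number of failures.
for n in range(3, 8):
    print(f"{n} servers handle {failures_handled(n)} failure(s)")

print("To survive 2 failures, deploy", min_ensemble_size(2), "servers")
```

Going from 5 to 6 servers leaves the tolerance at two failures, so the sixth machine adds cost without adding resilience.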
