
Number of Zookeepers in a 3 rack cluster with data nodes under 100


Contributor

Should 3 be sufficient for a 3-rack cluster with one ZK per rack? Does increasing to 5 ZK nodes make sense? My understanding is that 3 ZKs are good enough for fault tolerance, and that having 2 ZK nodes on the same rack doesn't increase HA.

1 ACCEPTED SOLUTION


Re: Number of Zookeepers in a 3 rack cluster with data nodes under 100

Super Guru

Well, this really depends on your tolerance for failure. ZooKeeper requires a quorum of servers to be up at all times, and it uses a majority quorum to make decisions. ZooKeeper is up when at least floor(N/2) + 1 servers are up, where N is the total number of servers in the ensemble (for odd N this is the same as ceil(N/2)). With a 3-node ZooKeeper ensemble you can tolerate one failure; with 5 nodes you can tolerate up to 2. The reason I would recommend 5 ZooKeeper nodes in your case is that your cluster is close to 100 nodes. To protect business continuity and confidently tolerate a couple of failures, it's better to go with 5 ZooKeepers.

Also, think about planned maintenance. With five ZooKeepers, you can take one out for maintenance and still tolerate one more failure. With three ZooKeepers, taking one out for maintenance leaves no tolerance for failure, so maintenance is also a challenge.

That being said, now that you know the implications of going with 3 vs. 5 ZooKeepers, you can decide to go with three, knowing that if one ZooKeeper fails you have a limited window to bring it back up, because one more failure means loss of quorum and a risk to the business.
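The arithmetic behind this answer can be sketched in a few lines of Python. This is only an illustration of the majority rule described above; the function names are made up for the example and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size: int, servers_up: int) -> bool:
    """True if the servers that are still up form a majority."""
    return servers_up >= quorum_size(ensemble_size)


for n in (3, 5):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")

# Planned maintenance scenario: one server is taken out on purpose,
# then one more fails unexpectedly (two servers down in total).
for n in (3, 5):
    servers_up = n - 2
    print(f"{n} servers, 1 in maintenance + 1 failed: "
          f"quorum held = {has_quorum(n, servers_up)}")
```

With three servers the maintenance scenario leaves a single server up, which is not a majority; with five, three servers remain and the ensemble keeps working.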

Re: Number of Zookeepers in a 3 rack cluster with data nodes under 100

Contributor

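The more, the better, but only to some extent: each extra pair of ZooKeeper servers tolerates one more failure, while every write still has to be acknowledged by a majority, so larger ensembles make writes slower.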


Re: Number of Zookeepers in a 3 rack cluster with data nodes under 100

Contributor
@mqureshi

What if I set up 4 ZooKeeper nodes? With ceil(N/2) the quorum would be ceil(4/2) = 2, but shouldn't it be 3, i.e. (N+1)/2 rounded up?
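To make the even-sized case concrete, here is a small Python comparison of the two formulas. It is a sketch for illustration only, with hypothetical function names: a strict majority needs more than half of the servers, so the two formulas agree for odd N and differ for even N.

```python
import math


def ceil_half(n: int) -> int:
    """ceil(N/2): correct for odd N only."""
    return math.ceil(n / 2)


def strict_majority(n: int) -> int:
    """floor(N/2) + 1, which equals (N+1)/2 rounded up."""
    return n // 2 + 1


print(" N  ceil(N/2)  majority  tolerated failures")
for n in range(1, 8):
    majority = strict_majority(n)
    print(f"{n:2d}  {ceil_half(n):9d}  {majority:8d}  {n - majority:18d}")

# With 4 servers, a "quorum" of 2 would let two disjoint halves each
# believe they have quorum, so the real requirement is 3 of 4.
# That leaves 4 servers tolerating 1 failure, the same as 3 servers.
```

The table shows why ensembles are normally sized with an odd number of servers: going from 3 to 4 adds a machine without adding any failure tolerance.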

Re: Number of Zookeepers in a 3 rack cluster with data nodes under 100

@ScipioTheYounger

@mqureshi's recommendations are correct. If you have good monitoring in place, and you must have it, 3 ZooKeepers should be enough. If one fails, the remaining two still form a quorum, but there is no margin left: a second failure takes ZooKeeper down. If you had five and one failed, you would still have a quorum and could tolerate one more failure. As you can see, 5 is better than 3 only if 2 are down at the same time, which is unlikely. Either way, you must have real-time monitoring and recovery. I would add that while you can share ZooKeepers across multiple services in Data Platform, some organizations prefer to allocate ZooKeepers specifically to their Kafka cluster. In that case you would have 3 ZooKeepers for Kafka, and probably Storm since that is quite a common combination, and 3 ZooKeepers for the other services.

Anyhow:

- monitor the state of your zookeepers

- put in place an automated recovery

- use 5 ZooKeepers if that makes you more comfortable than 3
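As a rough idea of what the monitoring point could look like, here is a minimal Python sketch that sends ZooKeeper's `ruok` four-letter command to each server and checks whether a majority answered `imok`. The host names in the usage comment are placeholders, the helper names are invented for this example, and in recent ZooKeeper versions the command may need to be enabled through the `4lw.commands.whitelist` setting.

```python
import socket


def zk_is_ok(host: str, port: int = 2181, timeout: float = 2.0) -> bool:
    """Send 'ruok' and return True only if the server replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                data = sock.recv(64)
                if not data:  # server closes the connection after replying
                    break
                chunks.append(data)
    except OSError:  # refused, unreachable, or timed out
        return False
    return b"".join(chunks) == b"imok"


def ensemble_has_quorum(statuses: dict) -> bool:
    """statuses maps 'host:port' -> bool; True if a majority is healthy."""
    healthy = sum(1 for ok in statuses.values() if ok)
    return healthy >= len(statuses) // 2 + 1


# Example usage (placeholder host names, not run here):
# servers = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
# statuses = {f"{h}:2181": zk_is_ok(h) for h in servers}
# if not ensemble_has_quorum(statuses):
#     print("ALERT: ZooKeeper has lost its majority", statuses)
```

Note that `imok` only says the server process is running and responding; it does not by itself prove that the server is part of a working quorum, so a real setup would likely combine this with other checks and an alerting or restart mechanism.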
