
ZK Best Practices

New Contributor

Hi all,

We run a few hundred nodes of HDP, primarily HDFS, YARN, HBase, Storm, and Kafka. Currently we have 5 ZooKeeper nodes, which all the components utilize. We've seen recommendations to split the 5 ZK nodes into two quorums (perhaps one for HBase, another for Storm). We've also seen recommendations to stick with one quorum and beef up the number of ZK nodes.

What is the consensus across the community on the best way to go? Vote now or forever hold your peace!

Also, how is Ambari affected if we go the "multi" route?

Thanks a bunch! Kyle

1 ACCEPTED SOLUTION

Super Guru
@Kyle Travis

To start, please check the following link for some basics. I assume you already know them, but I think it's still a good starting point.

https://community.hortonworks.com/questions/55201/number-of-zookeepers-in-a-3-rack-cluster-with-data...

Now, to the question of whether you should have separate ZooKeeper quorums. You mention "a few hundred nodes". I am assuming it's an HBase cluster and Storm is writing to HBase. Let me make some assumptions about your application:

1. High read/write throughput.

2. Time sensitive: HBase latency is important.

3. Other Hadoop components, like Hive or maybe Phoenix, are also running.

4. Kafka is not being used.

If my assumptions are reasonable and close to what you really have, I think having a separate ZooKeeper quorum for HBase might give you some benefits. ZooKeeper is very sensitive to timeouts. When one ensemble is serving multiple components at this scale, it can make sense to give HBase its own quorum. This would ensure better HBase operation and stability compared to sharing ZooKeeper across components. Although someone might argue that you shouldn't run into any issues with just one quorum either. Have you seen any issues in your testing?

For all the rest of the cluster, just one ZooKeeper quorum is fine.
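If you do go the split route, the change is mostly client-side configuration: HBase just gets pointed at its own ensemble. A minimal sketch of the relevant hbase-site.xml properties, with placeholder hostnames and an illustrative session timeout (not values from this thread):

    <!-- hbase-site.xml: point HBase at its dedicated ensemble (placeholder hosts) -->
    <property>
      <name>hbase.zookeeper.quorum</name>
      <value>zk-hbase-1.example.com,zk-hbase-2.example.com,zk-hbase-3.example.com</value>
    </property>
    <property>
      <name>hbase.zookeeper.property.clientPort</name>
      <value>2181</value>
    </property>
    <property>
      <!-- ZK session timeout in milliseconds; tune to your latency needs -->
      <name>zookeeper.session.timeout</name>
      <value>90000</value>
    </property>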

However, if you are using Kafka, I would have one quorum for everything including HBase, and a separate quorum just for Kafka. Kafka is very fragile with ZooKeeper. In my experience, it is better for Kafka to have its own ZooKeeper.
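On the broker side, pointing Kafka at a dedicated ensemble is a one-line change in server.properties. A sketch with placeholder hostnames (the /kafka chroot is optional, but it keeps Kafka's znodes under one isolated path):

    # server.properties: dedicated Kafka ensemble only (placeholder hosts)
    zookeeper.connect=zk-kafka-1.example.com:2181,zk-kafka-2.example.com:2181,zk-kafka-3.example.com:2181/kafka
    # How long a broker waits for the ZK connection before giving up
    zookeeper.connection.timeout.ms=6000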

This brings me to the last point of your question. Two ZooKeeper quorums are currently not supported by Ambari. There is a JIRA open, but no support yet.

https://issues.apache.org/jira/browse/AMBARI-14714


5 REPLIES


New Contributor

That's excellent info, thanks Tim and mqureshi for sharing. It sounds like splitting the ZKs is the way to go. mqureshi, your assumptions are pretty much on par, except that we are using Kafka as well, pretty heavily. We've seen funkiness (processing slowdowns, like Storm waiting on HBase) when there is a lot of concurrent activity on the cluster from the various components.

Interesting on the Ambari JIRA - hopefully that is in the queue to get fixed soon! When is 3.0 supposed to be out again? 🙂

Thanks again for the insight!!!

Super Guru

Great. If you are using Kafka, then have a separate ZooKeeper quorum for Kafka. I would never recommend using the same ZooKeeper for Kafka. Separate it out, and the "funkiness" you have seen should go away :).

Master Guru

The more nodes in a ZK ensemble (quorum), the faster the reads but the slower the writes. That's because a read can be served by any node, but a write is not complete until a majority of the nodes have acknowledged it. On top of that, early versions of Kafka (0.8.2 and older) keep consumer offsets in ZK.

Therefore, as already suggested by @mqureshi, it's best to start by creating a dedicated ZK ensemble for Kafka (I'd go for 3 nodes) and keep the 5-node ensemble for everything else. Beefing up the number of ZK nodes to 7 or more is a resounding no.

Regarding the installation and management of the new Kafka ZK: it's pretty straightforward to install manually; just follow the steps in one of the "Non-Ambari cluster installation guides" like this one. You can also try creating a cluster composed of only Kafka and ZK and managing it with its own Ambari instance.
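For the manual route, the heart of a 3-node ensemble is one zoo.cfg shared by all hosts plus a per-host myid file. A minimal sketch, with placeholder hostnames and paths:

    # conf/zoo.cfg (identical on all three hosts; hostnames are placeholders)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    # quorum port 2888, leader-election port 3888 (the conventional defaults)
    server.1=zk-kafka-1.example.com:2888:3888
    server.2=zk-kafka-2.example.com:2888:3888
    server.3=zk-kafka-3.example.com:2888:3888

Each host also needs its own id in the data directory, matching its server.N entry, e.g. echo 1 > /var/lib/zookeeper/myid on the first host (2 and 3 on the others).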