Created 06-27-2019 08:00 AM
We have HDP version 2.6.4 in our cluster.
The cluster includes 285 data node machines and 3 Kafka machines.
For now the ZooKeeper servers are installed on the master machines, so they do not run on independent machines.
But since this is a very important production cluster,
we are thinking of separating ZooKeeper from the master machines
and installing the ZooKeeper servers on separate, dedicated machines.
Am I right here?
Created 06-28-2019 02:38 AM
ZooKeeper is a lightweight process, so it does not consume many resources.
(EDIT) However, Hortonworks gives the following recommendation (for Kafka + ZooKeeper):
Here are several recommendations for ZooKeeper configuration with Kafka:
Do not run ZooKeeper on a server where Kafka is running.
When using ZooKeeper with Kafka you should dedicate ZooKeeper to Kafka, and not use ZooKeeper for any other components.
Make sure you allocate sufficient JVM memory. A good starting point is 4GB.
To monitor the ZooKeeper instance, use JMX metrics.
As far as high availability for ZooKeeper is concerned, you can refer to the HCC thread that discusses how to decide how many ZooKeepers you should have.
It depends on your requirements, and based on them we can decide whether the number of ZK hosts should be 3 or 5 (etc.). More ZK servers come with a cost, so please go through that thread.
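The trade-off between 3 and 5 ZK hosts comes down to majority arithmetic: a ZooKeeper ensemble stays available only while more than half of its servers are up. The following is a minimal sketch of that calculation; the function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of ZooKeeper or Ambari.

```shell
#!/bin/sh
# Majority (quorum) size for an ensemble of N ZooKeeper servers.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority still remains.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 3 4 5; do
  echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

An ensemble of 4 tolerates the same single failure as an ensemble of 3, which is why odd sizes such as 3 or 5 are the usual choices.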
Created 06-28-2019 04:11 AM
Dear Jay, about "Do not run ZooKeeper on a server where Kafka is running.": can you tell me which document from Hortonworks or Confluent supports this? I mean, is it an official statement?
Created 06-28-2019 04:36 AM
Do not run ZooKeeper on a server where Kafka is running.
The statement is taken from the following doc:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_kafka-component-guide/content/kafka-zook...
Created 06-28-2019 04:41 AM
Dear Jay, so from the document we see "When using ZooKeeper with Kafka you should dedicate ZooKeeper to Kafka, and not use ZooKeeper for any other components."
Does this mean that if we install ZooKeeper on the Kafka machines, then ZooKeeper must serve only Kafka? Am I correct?
Second, I guess it is much better to install the ZooKeeper servers on machines that run no other services, for example on a clean machine with only the Red Hat OS. Am I right here?
Created 06-28-2019 07:16 AM
The statements "you should dedicate ZooKeeper to Kafka" and "Do not run ZooKeeper on a server where Kafka is running." are meant to be read together.
ZooKeeper servers should not be installed or run on a Kafka broker host.
"Kafka should have ZooKeepers dedicated to it" means that the ZooKeepers used by Kafka should not be used by other services. For example, Kafka's ZooKeepers should not also be used by HBase, NameNode failover, AMS, etc.
Created 06-28-2019 06:30 AM
Dear Jay, about "Make sure you allocate sufficient JVM memory. A good starting point is 4GB.": which parameter in Ambari do we need to change in order to set the value to 4 GB?
Created 06-28-2019 07:20 AM
@Michael Bronson
ZooKeeper memory-related settings can be specified inside the zookeeper-env script (via the Ambari zookeeper-env template).
# grep 'SERVER_JVMFLAGS' /etc/zookeeper/3.1.0.0-78/0/zookeeper-env.sh
export SERVER_JVMFLAGS=-Xmx4096m
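To check that a configured value meets the 4 GB starting point, the `-Xmx` setting can be pulled out of that line and converted to megabytes. This is only a rough sketch: the helper `xmx_mb` is a hypothetical name, it understands only the `m` and `g` suffixes, and the sample line below stands in for the real grep output from your own zookeeper-env.sh.

```shell
#!/bin/sh
# Extract the -Xmx value from a SERVER_JVMFLAGS line and print it in MB.
# Only the m/M and g/G suffixes are handled; anything else returns non-zero.
xmx_mb() {
  value=$(printf '%s\n' "$1" | sed -n 's/.*-Xmx\([0-9][0-9]*\)\([mMgG]\).*/\1 \2/p')
  [ -n "$value" ] || return 1
  # Split "number suffix" into $1 and $2 (intentionally unquoted).
  set -- $value
  case $2 in
    m|M) echo "$1" ;;
    g|G) echo $(( $1 * 1024 )) ;;
  esac
}

# Sample line, standing in for the grep output shown above.
sample='export SERVER_JVMFLAGS=-Xmx4096m'
heap=$(xmx_mb "$sample")
if [ "$heap" -ge 4096 ]; then
  echo "heap ${heap} MB meets the 4 GB starting point"
else
  echo "heap ${heap} MB is below the 4 GB starting point"
fi
```

On a managed cluster the value itself should be changed in the Ambari zookeeper-env template rather than in the generated file, since Ambari regenerates that file.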
Created 07-01-2019 05:40 AM
Dear Jay, I just want to clarify this.
We want to install 3 ZooKeeper servers that serve only Kafka (and no other application).
In that case, can we install the 3 ZooKeeper servers on the 3 Kafka hosts,
or do we need to dedicate new hosts (without Kafka) to the new ZooKeeper servers?
If we can't install the ZooKeeper servers (that serve only Kafka) on the Kafka hosts,
can you please explain why?
Created 07-01-2019 06:11 AM
As per the standard recommendation / best practice: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_kafka-component-guide/content/kafka-zook...
For example:
1). Do not run ZooKeeper on a server where Kafka is running.
If Kafka brokers are installed on node1, node2, node3, then the ZooKeepers should run on other cluster nodes where Kafka is not installed, such as node4, node5, node6.
2). When using ZooKeeper with Kafka you should dedicate ZooKeeper to Kafka, and not use ZooKeeper for any other components.
This means the ZooKeepers running on node4, node5, node6 should be dedicated to Kafka and should not be used for other purposes such as HBase, NameNode failover, AMS, etc.
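The layout in this example can be sanity-checked with a short script. The sketch below reuses the node1..node6 names from the example, assumes ZooKeeper's default client port 2181, and prints the value that Kafka's `zookeeper.connect` broker property would carry for the dedicated ensemble. The helpers `overlap` and `zk_connect` are hypothetical names, not Kafka or Ambari tools.

```shell
#!/bin/sh
KAFKA_HOSTS="node1 node2 node3"   # Kafka broker hosts from the example
ZK_HOSTS="node4 node5 node6"      # dedicated ZooKeeper hosts from the example
ZK_PORT=2181                      # assumed: ZooKeeper's default client port

# Print every host that appears in both lists (ideally none).
overlap() {
  for k in $1; do
    for z in $2; do
      if [ "$k" = "$z" ]; then
        echo "$k"
      fi
    done
  done
}

# Build a host:port,host:port,... string for zookeeper.connect.
zk_connect() {
  out=""
  for h in $1; do
    out="${out:+$out,}$h:$2"
  done
  echo "$out"
}

shared=$(overlap "$KAFKA_HOSTS" "$ZK_HOSTS")
if [ -n "$shared" ]; then
  echo "WARNING: ZooKeeper and Kafka share hosts: $shared"
else
  echo "zookeeper.connect=$(zk_connect "$ZK_HOSTS" "$ZK_PORT")"
fi
```

The script only checks that the two host lists do not overlap; whether other services (HBase, NameNode failover, AMS) still point at the same ZooKeepers has to be verified in their own configurations.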