Created 10-30-2015 04:42 AM
What other services are best to colocate on a host with ZooKeeper, and how does this change as the number of hosts increases?
Does it make sense not to run it on a host with HA services, since those are what it protects? If running on a NodeManager, what adjustments should be made to memory available for YARN containers?
Created 10-30-2015 09:33 AM
Generally, it's OK to deploy ZK alongside other components; a dedicated server is not required. As you know, an odd number of ZK servers is the best practice. I see no issue with deploying on HA nodes, but I would also deploy on non-HA nodes to keep the balance.
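To illustrate why an odd number is the usual recommendation, here is a rough Python sketch of the majority arithmetic. ZooKeeper needs a strict majority of servers up to keep serving; the function names are made up for this example.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in range(3, 8):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 servers (or 5 to 6) does not increase the number of failures you can survive, which is why even-sized ensembles are usually not worth it.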
Here are some common problems you can avoid by configuring ZooKeeper correctly:
To avoid swapping, try to set the heap size to the amount of physical memory you have, minus the amount needed by the OS and cache. The best way to determine an optimal heap size for your configuration is to run load tests. If for some reason you can't, be conservative in your estimates and choose a number well below the limit that would cause your machine to swap. For example, on a 4 GB machine, a 3 GB heap is a conservative estimate to start with.
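As a minimal sketch of that arithmetic (not a substitute for load testing), the snippet below subtracts a reserve for the OS and cache from physical memory and prints the matching JVM `-Xmx` value. The 1 GB reserve is an assumption chosen only to reproduce the 4 GB / 3 GB example above, and the function name is hypothetical. On a host shared with other services such as a NodeManager, the reserve would also have to cover what those services need.

```python
def suggested_heap_mb(physical_mb: int, reserved_mb: int) -> int:
    """Heap = physical memory minus what the OS, cache, and any
    colocated services are assumed to need (reserved_mb)."""
    if reserved_mb >= physical_mb:
        raise ValueError("reserve must be smaller than physical memory")
    return physical_mb - reserved_mb


if __name__ == "__main__":
    # Assumed values: 4 GB machine, 1 GB kept back for OS and cache.
    heap = suggested_heap_mb(physical_mb=4096, reserved_mb=1024)
    print(f"-Xmx{heap}m")
```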
Created 11-01-2015 01:04 AM
Excellent tips, thank you.
Is there a guideline for when to add another pair of ZK servers? Cluster size, number of services that use ZK, any services that are particularly demanding, etc?