
Can Zookeeper handle Hs2 HA for ODBC connections?


Hi,

Currently we are using F5 for HiveServer2 load balancing and failover for all of our ODBC and JDBC connections, and we haven't seen any issues in over a year.

But we heard that a ZooKeeper namespace can also handle HS2 load balancing and failover for ODBC connections. Is that true? If so,

can you suggest which approach is best, and what are others using for the same purpose?

Thanks for your suggestions.

1 ACCEPTED SOLUTION

Super Guru
@Divakar Annapureddy

First things first: yes, it is possible and supported. Here is the link:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_hadoop-ha/content/ch_HA-HiveServer2.html
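For reference, dynamic service discovery is driven by the client connection string rather than a load balancer. A sketch, assuming a three-node ZooKeeper ensemble and the default namespace (your hostnames, and the value of hive.server2.zookeeper.namespace in hive-site.xml, will differ):

```
# Instead of pointing at a single HS2 host, the client lists the ZooKeeper
# quorum; the driver resolves a live HiveServer2 instance registered under
# the given znode namespace and connects to it.
jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
```

The Hortonworks ODBC driver exposes equivalent settings (service discovery mode and ZooKeeper namespace) in its DSN configuration, so ODBC clients can fail over the same way.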

What follows is my personal preference and opinion, so weigh it against how you do things in your organization. First: if it's not broken, why fix it?

Second, and this may be important depending on your utilization and number of requests per second: ZooKeeper is very sensitive to timeouts, especially if the same ensemble is already serving components like HBase or Kafka (Kafka should have its own ZooKeeper regardless). That is why one best practice is to give ZooKeeper its own dedicated disk. If the NameNode is the only thing ZooKeeper manages, then it's fine; but if HBase or Kafka already point to the same ZooKeeper, why add one more component, especially when the one you have is working just fine?
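On the dedicated-disk point: ZooKeeper lets you separate the transaction log from snapshots in zoo.cfg. A minimal sketch (paths are examples, not your actual layout):

```
# zoo.cfg sketch: keep the transaction log on its own disk so fsync latency
# caused by other workloads cannot trigger client session timeouts.
tickTime=2000                      # base time unit (ms) for session timeouts
dataDir=/var/lib/zookeeper         # snapshots and the myid file
dataLogDir=/zookeeper/txlog        # transaction log on a dedicated disk
```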

As for what others are doing, I can't speak to ZooKeeper because I have only seen customers use a load balancer like F5. I can say confidently that the ZooKeeper approach is less commonly deployed in industry, probably because it's a newer feature.




Thanks for your input.

Reason for considering the change: we could avoid the dependency on third-party tools like F5 if ZooKeeper can do the same job.