About Load Balancer's SPoF

Hello Team,

I am considering an HA configuration for a Kerberized CDH cluster.

When HiveServer2 HA is configured behind a load balancer such as HAProxy, is it possible to keep the load balancer itself from becoming a SPoF?
For example, can I deploy two load balancers as Active-Active and specify both hosts and ports in the HiveServer2 Load Balancer property?
Or should I deploy the load balancers as Active and Standby and, when the Active one fails, change the HiveServer2 Load Balancer property to point to the Standby?

I have the same question for impalad.

I am referring to the following procedures:

 * https://www.cloudera.com/documentation/enterprise/6/6.0/topics/admin_ha_hiveserver2.html#concept_u4b...
 * https://www.cloudera.com/documentation/enterprise/6/6.0/topics/impala_proxy.html#proxy

Thank you for reading.

4 REPLIES

Super Guru
Currently CM supports only one LB setting in both the Hive and Impala configurations, so your idea of two active LBs is not supported.

When the LB fails, you can still connect directly to each HS2 instance if needed.
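
As a rough illustration of that fallback, a client could try the LB first and only then the individual HS2 instances. This is a minimal sketch, not a CM or Hive feature: the function names `tcp_reachable` and `pick_endpoint`, the hostnames, and port 10000 are all assumptions made up for the example, and it only checks that a TCP port accepts connections. Which Kerberos principal the client must present is a separate question that this sketch does not handle.

```python
import socket
from typing import Callable, Optional, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def tcp_reachable(endpoint: Endpoint, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def pick_endpoint(
    load_balancer: Endpoint,
    hs2_hosts: Sequence[Endpoint],
    is_up: Callable[[Endpoint], bool] = tcp_reachable,
) -> Optional[Endpoint]:
    """Prefer the load balancer; fall back to the first reachable HS2 host.

    Returns None when nothing is reachable.
    """
    for endpoint in (load_balancer, *hs2_hosts):
        if is_up(endpoint):
            return endpoint
    return None


if __name__ == "__main__":
    # Hypothetical hosts; replace with the real LB and HS2 addresses.
    lb = ("lb.example.com", 10000)
    hs2 = [("hs2-1.example.com", 10000), ("hs2-2.example.com", 10000)]
    print(pick_endpoint(lb, hs2))
```

The reachability check is passed in as `is_up` so that the selection logic can be exercised without a live cluster.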

Hope the above helps.

Thank you for your reply.

I understand that only one LB setting is supported.

Let me confirm that a direct connection to HS2 is really possible.

The following descriptions in the procedures state that, in a Kerberized CDH cluster, direct connections to HS2 fail. Is it still possible to connect?
In the case of impalad, is it possible to connect directly with clients other than impala-shell, such as ODBC / JDBC?
 * https://www.cloudera.com/documentation/enterprise/6/6.0/topics/admin_ha_hiveserver2.html#concept_c1m...
  * Warning:
   * On Kerberos-enabled clusters, you must use the load balancer for all connections. After you enable HiveServer2 high availability, direct connections to HiveServer2 instances fail.
 * https://www.cloudera.com/documentation/enterprise/6/6.0/topics/impala_proxy.html#proxy_kerberos
  * Once you enable a proxy server in a Kerberized cluster, users will not be able to connect to individual impala daemons directly from impala-shell.

Super Guru
That is true for Impala, but not for HS2. I have asked our documentation team to update it.

After HA is enabled for HS2, you can still connect to an HS2 instance directly. The one thing to make sure of is that you use the LB's principal, not the HS2 host's principal.
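
To make the principal point concrete, here is a minimal sketch that builds a Hive JDBC URL whose host is an individual HS2 instance while the `principal` parameter names the LB host. The helper name `hs2_jdbc_url`, the hostnames, the realm, port 10000, and the `hive` service name are assumptions for illustration; use the values from your own cluster.

```python
def hs2_jdbc_url(
    hs2_host: str,
    lb_host: str,
    realm: str,
    port: int = 10000,
    database: str = "default",
    service: str = "hive",
) -> str:
    """Build a JDBC URL that connects to one HS2 host directly
    but authenticates against the load balancer's Kerberos principal."""
    principal = f"{service}/{lb_host}@{realm}"
    return f"jdbc:hive2://{hs2_host}:{port}/{database};principal={principal}"


if __name__ == "__main__":
    # Hypothetical hosts and realm.
    print(hs2_jdbc_url("hs2-1.example.com", "lb.example.com", "EXAMPLE.COM"))
    # jdbc:hive2://hs2-1.example.com:10000/default;principal=hive/lb.example.com@EXAMPLE.COM
```

The same URL can be passed to a JDBC client such as beeline with `-u`, after obtaining a Kerberos ticket with `kinit`.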

Hope that helps.

Thank you for your quick reply.

I understand that in a Kerberized environment with an LB configured for both HS2 and impalad, I can connect directly to HS2 but not directly to impalad.

Thank you for arranging the documentation correction. I refer to the documentation often, so it helps a lot.