Explorer
Posts: 23
Registered: ‎07-05-2018

Hiveserver2 HA using haproxy load balancing


Hello Team,

 

We have a CDH 5.15 cluster running, with Kerberos and TLS enabled for all services in the cluster.

We would like to enable HA for HiveServer2 using an HAProxy load balancer.

We have enabled HA for the Hive Metastore using the link below. Two instances of the Hive Metastore are up and running.

https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_ha_hivemetastore.html

We are referring to the link below for HiveServer2 HA.

https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_ha_hiveserver2.html

HAProxy, one Hive Metastore instance, and one HiveServer2 instance are installed on the same node.

beeline throws the error below.

 

beeline> !connect jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxx;principal=hive/aabc@REALM
Connecting to jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxx;principal=hive/aabc@REALM
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxxx;principal=hive/aabc@REALM: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake (state=08S01,code=0)

 

 

Below is a snippet of the haproxy config:

 

# This is the setup for HS2. beeline client connect to load_balancer_host:10001.
# HAProxy will balance connections among the list of servers listed below.
listen hiveserver2 :10001
mode tcp
option tcplog
balance source
server hiveserver2_1 abc:10000
server hiveserver2_2 xyz:10000

 

 

Kindly suggest what might be wrong.

 

 

- Vijay M

Posts: 938
Topics: 1
Kudos: 218
Solutions: 117
Registered: ‎04-22-2014

Re: Hiveserver2 HA using haproxy load balancing

@VijayM,

 

We see by the following error that the failure occurred during the TLS handshake:

javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake (state=08S01,code=0)

 

In this case, it is probable that the server did not understand the connection sent to it. To debug further, you can examine the logs of your HAProxy and of the HiveServer2 instance that you connected to.

 

I would also suggest testing without the HAProxy: connect directly with beeline to each of the HS2 instances and see if you can connect. This will help isolate whether to look more closely at HiveServer2 or the HAProxy.
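
A minimal sketch of that direct test, assuming the HS2 hosts abc and xyz on port 10000 from the HAProxy config and the truststore path from the original beeline command. The function names, the REALM placeholder and the TRUSTSTORE_PASSWORD variable are made up for illustration, and the principal assumes each HS2 runs as hive/<its own host>@REALM:

```shell
# Hypothetical values taken from the thread; adjust to your cluster.
TRUSTSTORE=/app/bds/security/pki/cloudera_truststore.jks
REALM=REALM

# Build a JDBC URL that points straight at one HS2 instance (no HAProxy).
# trustStorePassword is the parameter name as I understand the Hive JDBC docs;
# the original command used sslTrustPassword.
hs2_direct_url() {
    host=$1
    port=${2:-10000}
    echo "jdbc:hive2://${host}:${port}/default;ssl=true;sslTrustStore=${TRUSTSTORE};trustStorePassword=${TRUSTSTORE_PASSWORD:-changeit};principal=hive/${host}@${REALM}"
}

# Try each HS2 instance in turn; requires beeline and a valid Kerberos ticket.
test_hs2_direct() {
    for host in "$@"; do
        echo "== ${host} =="
        beeline -u "$(hs2_direct_url "$host")" -e 'SELECT 1;'
    done
}
```

Running `test_hs2_direct abc xyz` would then try both instances. If both succeed, HiveServer2 itself is probably fine and the HAProxy side deserves the closer look.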

 

If you know tcpdump, it is perfect for debugging TLS handshake problems since it lets you see all the handshake communication.  Wireshark can decode the packets and display the handshake nicely. If that is not something you know well, let's hit the logs first.
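
If you do want to try a capture, here is a rough sketch, assuming ports 10001 (the HAProxy listener) and 10000 (HS2) from the config in the original post. The function names and the output path are illustrative only:

```shell
# Ports from the thread: 10001 is the HAProxy listener, 10000 is HS2 itself.
hs2_capture_filter() {
    echo "tcp port ${1:-10001} or tcp port ${2:-10000}"
}

# Write raw packets to a pcap file for Wireshark; needs root. Stop with Ctrl-C.
capture_hs2_handshake() {
    out=${1:-/tmp/hs2-handshake.pcap}   # hypothetical output path
    tcpdump -i any -s 0 -w "$out" "$(hs2_capture_filter)"
}
```

Start `capture_hs2_handshake` on the HAProxy host, reproduce the beeline error, stop the capture, and open the pcap file in Wireshark to see which side closes the connection.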

New Contributor
Posts: 2
Registered: ‎09-07-2018

Re: Hiveserver2 HA using haproxy load balancing


This is getting really complicated for me, please help!

Explorer
Posts: 23
Registered: ‎07-05-2018

Re: Hiveserver2 HA using haproxy load balancing

@bgooley,

 

I have TLS-enabled HiveServer2 with two instances running on two different hosts.

HAProxy is installed and configured on the same server where one of the Hive instances is running.

Kindly confirm the following.

1. Do I need to define a TLS certificate anywhere in the haproxy config? If yes, is there any documentation for it?

2. Does haproxy also need to be configured with TLS?

Is there any documentation for installing and configuring a load balancer for TLS-enabled HiveServer2?

 

- VIjay Mishra

Posts: 938
Topics: 1
Kudos: 218
Solutions: 117
Registered: ‎04-22-2014

Re: Hiveserver2 HA using haproxy load balancing

@VijayM,

 

 

If you are using TLS passthrough, then you don't need to configure certificates for HAProxy, as the TLS handshake is done with the HS2 servers themselves. This does add some extra work for you, though, as it means that you need to be sure that the hostname(s) in the HS2 server certificates match the name of your HAProxy host.

 

This can be done in a few ways, such as issuing a server certificate that contains a SubjectAltName value equal to the HAProxy host's fully-qualified domain name, or you could use a wildcard certificate that matches the domain.
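
To see which names a running HS2 instance actually presents, something along these lines should work. It is a sketch that assumes openssl is installed and that HS2 listens on port 10000 as in the thread; the function names are made up:

```shell
# Keep only the SAN section from `openssl x509 -text` output read on stdin.
extract_san() {
    grep -A1 'Subject Alternative Name'
}

# Fetch the certificate a TLS endpoint presents and print its SAN entries.
show_cert_san() {
    host=$1
    port=${2:-10000}
    openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
        | openssl x509 -noout -text \
        | extract_san
}
```

For example, `show_cert_san abc 10000` prints the DNS entries of the certificate on abc. With passthrough, the HAProxy host's fully-qualified name (or a matching wildcard) needs to be among them.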

 

If you are using TLS termination, the client does the TLS handshake with HAProxy, which can then make either TLS or non-TLS connections to the backend servers. In this case, HAProxy will decrypt the incoming request and then re-encrypt it if your HS2 servers are listening on TLS ports.

 

In that case, you do have to specify a server certificate for HAProxy's frontend and you need to use a trust store to trust the signer of the HS2 certificates.

 

There is information out there, but this page (despite a few mistakes) does a pretty good job of covering each approach:

https://serversforhackers.com/c/using-ssl-certificates-with-haproxy

 

Here is an example I'm using on my server (HAProxy in tcp mode, with certificates configured on both the frontend and the backend servers):

frontend hiveserver2_front
bind *:10015 ssl crt /etc/cdep-ssl-conf/CA_STANDARD/cert_key.pem
mode tcp
option tcplog
default_backend hiveserver2

 

backend hiveserver2
balance source
mode tcp
server hiveserver2_1 tls12-1.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
server hiveserver2_2 tls12-4.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
server hiveserver2_3 tls12-2.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem

 

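NOTE: in the above, I have mode tcp set, which means HAProxy does no HTTP header evaluation and simply relays the byte stream. Because the bind line carries ssl crt and the server lines carry ssl, HAProxy still terminates the client's TLS session and opens a new TLS session to each backend. A pure pass-through setup would omit those ssl options so that the handshake happens directly between the client and HS2.

Since I have server certificate and truststore files configured, I could also switch to mode http, though that would only apply if HS2 were running in HTTP transport mode.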
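
For comparison, here is a sketch of a listener that does no TLS processing at all in HAProxy, so the handshake happens end to end between beeline and HS2. It assumes the hosts abc and xyz and the ports from the original post; the scratch file path and the timeout values are placeholders I picked, not recommendations:

```shell
# Scratch location for the candidate config (hypothetical path).
CFG="${TMPDIR:-/tmp}/haproxy-hs2-passthrough.cfg"

cat > "$CFG" <<'EOF'
# Timeout values are placeholders, not tuned recommendations.
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

# beeline connects to load_balancer_host:10001; HAProxy only relays bytes.
listen hiveserver2
    bind *:10001
    option tcplog
    balance source
    server hiveserver2_1 abc:10000 check
    server hiveserver2_2 xyz:10000 check
EOF

# Syntax-check the file if haproxy is installed; -c validates without starting.
if command -v haproxy >/dev/null 2>&1; then
    haproxy -c -f "$CFG" || echo "haproxy reported a problem with $CFG"
fi
```

With this layout no certificate is configured in HAProxy, which is why the HS2 certificates themselves have to carry a name that matches the load balancer host.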

 

I'm no HAProxy expert, but I am pretty sure the above should help you.

Explorer
Posts: 23
Registered: ‎07-05-2018

Re: Hiveserver2 HA using haproxy load balancing

@bgooley,

 

I was engaged in some other projects, so I was unable to reply. I started working on this again today.

When I connect to both HiveServer2 instances without haproxy (after removing the load balancer entry from the Hive configuration), I am able to connect to both HiveServer2 instances from beeline.

I have HiveServer2 TLS enabled using CA-signed certificates, and the HiveServer2 certificates are in Java format, i.e. .jks (keystore.jks and truststore.jks).

In my haproxy configuration, I am giving the keystore.jks entry on the bind line and the truststore.jks entry on the backend server lines for both servers.

Kindly confirm whether this is correct, or suggest otherwise.

 

- VIjay M

Posts: 938
Topics: 1
Kudos: 218
Solutions: 117
Registered: ‎04-22-2014

Re: Hiveserver2 HA using haproxy load balancing

Hi @VijayM,

 

Without seeing the configuration you have, it is hard to say what is correct. Perhaps you can share it and we can see if there is something obvious. I would strongly suggest looking at the HAProxy logs and the HiveServer2 logs when the problem happens to look for any TLS errors or related messages.
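
One thing worth checking in the meantime: HAProxy's crt and ca-file options expect PEM files, not JKS, so a keystore.jks on the bind line will not load. If you go the termination route, a conversion along these lines may help. It is only a sketch; the function names and the derived file names are made up, and both tools will prompt for the store passwords:

```shell
# Derive the PEM file name from a JKS path: /x/keystore.jks -> /x/keystore.pem
pem_path_for() {
    echo "${1%.jks}.pem"
}

# Convert a JKS keystore to one PEM file holding certificate(s) and private key.
jks_to_pem() {
    jks=$1
    p12="${jks%.jks}.p12"
    pem=$(pem_path_for "$jks")
    keytool -importkeystore -srckeystore "$jks" -srcstoretype JKS \
            -destkeystore "$p12" -deststoretype PKCS12 &&
    openssl pkcs12 -in "$p12" -nodes -out "$pem" &&
    chmod 600 "$pem"   # the PEM now contains an unencrypted private key
}
```

Usage would be `jks_to_pem /path/to/keystore.jks`, with the resulting PEM referenced on the bind line. For the backend ca-file, the CA certificate can be exported from the truststore with `keytool -exportcert -rfc`, using whatever alias your truststore holds. None of this is needed for plain passthrough.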

Explorer
Posts: 23
Registered: ‎07-05-2018

Re: Hiveserver2 HA using haproxy load balancing

@bgooley,

Neither the HAProxy log nor the HiveServer2 logs show anything.

I will send you the configuration after Monday, as I am on leave.

- Vijay M
Explorer
Posts: 23
Registered: ‎07-05-2018

Re: Hiveserver2 HA using haproxy load balancing

@bgooley

 

Below are the details of the certificates I have on the cluster.

The certificate below is from the root CA:

-rwxr-xr-x. 1 cloudera-scm cloudera-scm 8152 Oct  5 10:36 cacerts.pem

The files below are the keystore and truststore used by the TLS-enabled Hive service.

 

-rwxr-xr-x. 1 cloudera-scm cloudera-scm 9624 Oct 5 10:38 cloudera_keystore.jks
-rwxr-xr-x. 1 cloudera-scm cloudera-scm 4048 Oct 5 10:39 cloudera_truststore.jks

 

 

Below is the haproxy configuration:

 

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend hiveserver2_front
bind *:443
option tcplog
mode tcp
default_backend hiveserver2

# This is the setup for HS2. beeline client connect to load_balancer_host:10001.
# HAProxy will balance connections among the list of servers listed below.
backend hiveserver2
mode tcp
balance source
option ssl-hello-chk
server hiveserver2_1 abc:10000 check
server hiveserver2_2 xyz:10000 check

 

 

The HiveServer2 configuration in Cloudera Manager has the following property set:

HiveServer2 Load Balancer      abc:443

1. Kindly confirm whether I have to add https or http in the above property. Is it required?

2. Kindly review the configuration and let me know if any more details are required.

3. One of the HiveServer2 instances and the haproxy service are configured on the same server, i.e. abc. Is that an issue?

 

 

Kindly suggest?

 

- Vijay M

 

Posts: 938
Topics: 1
Kudos: 218
Solutions: 117
Registered: ‎04-22-2014

Re: Hiveserver2 HA using haproxy load balancing

@VijayM,

 

Based on your original message and your configuration, I think the HAProxy bind port is the issue.

 

You have:

 

bind *:443

 

But you are trying to connect via TLS to port 10001.

 

Maybe try:

 

bind *:10001

 

Then restart HAProxy.

Hope it is that simple.  If that doesn't work, let us know and we can use openssl s_client to observe the handshake to see what happens.
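
In case it comes to that, here is a sketch of the openssl s_client check, assuming the HAProxy listener on abc:10001 and the two HS2 instances on port 10000. The CA_FILE path and the function names are placeholders:

```shell
# Endpoints from the thread: the HAProxy listener first, then each HS2 directly.
TARGETS="abc:10001 abc:10000 xyz:10000"
CA_FILE=${CA_FILE:-cacerts.pem}   # hypothetical path to the root CA bundle

# Print the handshake summary for one host:port; stdin is closed so it returns.
check_handshake() {
    echo "== $1 =="
    openssl s_client -connect "$1" -CAfile "$CA_FILE" </dev/null
}

# Run the same check against the load balancer and each backend.
check_all() {
    for target in $TARGETS; do
        check_handshake "$target"
    done
}
```

Calling `check_all` lets you compare the output through the load balancer with the output from each HS2 instance. If the direct connections show a certificate chain and the one through port 10001 does not, the problem is on the HAProxy side.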
