Created on 10-08-2018 12:08 AM - edited 09-16-2022 06:47 AM
Hello Team,
We have a CDH 5.15 cluster running with Kerberos and TLS enabled for all services in the cluster.
We would like to enable HA for HiveServer2 using an HAProxy load balancer.
We have enabled HA for the Hive Metastore using the link below; 2 instances of the Hive Metastore are up and running.
https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_ha_hivemetastore.html
Referring to the link below for HiveServer2 HA.
https://www.cloudera.com/documentation/enterprise/5-15-x/topics/admin_ha_hiveserver2.html
HAProxy, 1 instance of the Hive Metastore, and 1 instance of HiveServer2 are installed on the same node.
beeline throws the error below.
beeline> !connect jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxx;principal=hive/aabc@REALM
Connecting to jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxx;principal=hive/aabc@REALM
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=xxxxxx;principal=hive/aabc@REALM: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake (state=08S01,code=0)
Below is the HAProxy config:
# This is the setup for HS2. beeline client connect to load_balancer_host:10001.
# HAProxy will balance connections among the list of servers listed below.
listen hiveserver2 :10001
mode tcp
option tcplog
balance source
server hiveserver2_1 abc:10000
server hiveserver2_2 xyz:10000
Kindly suggest?
- Vijay M
Created 11-19-2018 08:08 AM
> java.security.cert.CertificateException: No subject alternative DNS name matching abc found.
Hi,
This error is important to note, as it would appear to mean that a certificate is now available to the client. The balancing algorithm really has no bearing on this particular issue; it is the certificate that must be addressed. Per RFC 6125 (and RFC 2818 before it), when a certificate carries DNS entries in Subject Alternative Name (SAN), clients check the hostname against those entries and ignore the CN, so every name that clients connect with must be listed there (by convention, the CN is included as one of them). The error tells us that abc does not match any DNS entry in the certificate's SAN.
You need to review the CN and the Subject/DNS Alt Names on the certificates in use by HiveServer2.
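To make the matching rule concrete, here is a rough Python sketch of how a client compares the name it connected with against the DNS entries in the SAN. It is a simplification for illustration, not the actual Java implementation, and the example.com hostnames are made up.

```python
def matches_san(hostname: str, san_dns_names: list[str]) -> bool:
    """Return True if hostname matches any SAN DNS entry, in any position."""
    host_labels = hostname.lower().rstrip(".").split(".")
    for name in san_dns_names:
        pattern_labels = name.lower().rstrip(".").split(".")
        if len(pattern_labels) != len(host_labels):
            continue
        # Simplification: a wildcard is honoured only as the whole left-most label.
        first_ok = pattern_labels[0] in ("*", host_labels[0])
        if first_ok and pattern_labels[1:] == host_labels[1:]:
            return True
    return False


if __name__ == "__main__":
    sans = ["xyz.example.com", "abc.example.com"]  # hypothetical SAN entries
    print(matches_san("abc.example.com", sans))  # listed second, still matches
    print(matches_san("abc", sans))              # short name is not in the SAN
```

Note that the short name abc fails even though abc.example.com is listed, which is the same shape of failure as the error quoted above: the name typed in the JDBC URL has to appear in the SAN exactly (or be covered by a wildcard).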
Created 10-08-2018 10:03 AM
We see by the following error that the failure occurred during the TLS handshake:
javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake (state=08S01,code=0)
In this case, it is probable that the server did not understand the connection sent to it. To debug further, you can examine the logs of your HAProxy and also of the HiveServer2 instance that you connected to.
I would also suggest testing without the HAProxy (connect directly with beeline to each of the HS2 instances and see if you can connect). This will help isolate whether to look more closely at HiveServer2 or the HAProxy.
If you know tcpdump, it is perfect for debugging TLS handshake problems since it lets you see all the handshake communication. Wireshark can decode the packets and display the handshake nicely. If that is not something you know well, let's hit the logs first.
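If tcpdump feels like too much to start with, a small script can run the same handshake against the load balancer and against each HS2 host and report where it fails. The following is only a sketch: probe_tls is a hypothetical helper name, and it assumes the signing CA is available as a PEM file (Python's ssl module cannot read a .jks truststore directly).

```python
import socket
import ssl


def probe_tls(host, port, cafile=None, timeout=5.0):
    """Attempt a TLS handshake and report the result as a dict."""
    context = ssl.create_default_context(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                cert = tls.getpeercert()
                # Collect the DNS entries from the certificate's SAN, if any.
                sans = [v for k, v in cert.get("subjectAltName", ()) if k == "DNS"]
                return {"ok": True, "protocol": tls.version(), "san_dns": sans}
    except (ssl.SSLError, OSError) as exc:
        # Covers refused connections, resets, early EOF and verification errors.
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    # Hosts and ports as used in this thread; the CA path is an assumption.
    for host, port in [("abc", 10001), ("abc", 10000), ("xyz", 10000)]:
        print(host, port, probe_tls(host, port, cafile="/path/to/ca.pem"))
```

If the direct connections to port 10000 succeed and only the load balancer port fails, that points at HAProxy rather than HiveServer2.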
Created 10-16-2018 08:35 AM
I have TLS-enabled HiveServer2 with 2 instances running on 2 different hosts.
HAProxy is installed and configured on the same server where 1 Hive instance is running.
Kindly confirm the below.
1. Do I need to define a TLS certificate anywhere in the HAProxy config? If yes, is there any documentation for it?
2. Does HAProxy also need to be configured with TLS?
Is there any documentation for installing and configuring a load balancer for TLS-enabled HiveServer2?
- VIjay Mishra
Created 10-17-2018 11:04 AM
If you are using TLS passthrough, then you don't need to configure certificates for HAProxy, as the TLS handshake is done with the HS2 servers themselves. This does add some extra work for you, though, as it means that you need to be sure that the hostname(s) in the HS2 server certificates match the name of your HAProxy host.
This can be done in a few ways, such as issuing a server certificate that contains SubjectAltName value equal to the HAProxy host's fully-qualified domain name or you could use a wildcard that matches the domain.
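As an illustration of the SubjectAltName approach, this small Python sketch builds the extension line in the syntax OpenSSL configuration files use, listing both the HS2 host and the load balancer. The FQDNs are placeholders, and how the line is fed into your CA's signing process is something I'm assuming rather than prescribing.

```python
def san_extension(hs2_fqdn: str, load_balancer_fqdn: str) -> str:
    """Build an OpenSSL-style subjectAltName line covering both names."""
    names = []
    for name in (hs2_fqdn, load_balancer_fqdn):
        name = name.strip().lower()
        if name and name not in names:  # skip blanks and duplicates
            names.append(name)
    return "subjectAltName = " + ",".join(f"DNS:{n}" for n in names)


if __name__ == "__main__":
    # Hypothetical names: one HS2 host plus the HAProxy host.
    print(san_extension("hs2-1.example.com", "hs2-lb.example.com"))
```

Each HS2 certificate would carry its own host name plus the load balancer name, so the same certificate validates whether the client connects directly or through HAProxy.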
If you are using TLS termination, the client does the TLS handshake with HAProxy, which can then make either TLS or non-TLS connections to the backend servers. In this case, HAProxy decrypts the incoming request and then re-encrypts it if your HS2 servers are listening on TLS ports.
In that case, you do have to specify a server certificate for HAProxy's frontend and you need to use a trust store to trust the signer of the HS2 certificates.
There is information out there, but this page (despite a few mistakes) does a pretty good job of covering each:
https://serversforhackers.com/c/using-ssl-certificates-with-haproxy
An example is the one I'm using on my server:
frontend hiveserver2_front
bind *:10015 ssl crt /etc/cdep-ssl-conf/CA_STANDARD/cert_key.pem
mode tcp
option tcplog
default_backend hiveserver2
backend hiveserver2
balance source
mode tcp
server hiveserver2_1 tls12-1.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
server hiveserver2_2 tls12-4.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
server hiveserver2_3 tls12-2.example.com:10000 ssl ca-file /etc/cdep-ssl-conf/CA_STANDARD/truststore.pem
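NOTE: in the above, I have mode tcp set, which means there is no HTTP header evaluation. Strictly speaking this is not pure passthrough, though: because the bind line has ssl crt and the server lines have ssl ca-file, HAProxy terminates the client's TLS session and opens a new one to the backend. For pure passthrough, leave the ssl options off both the bind and the server lines so that the handshake happens with the HS2 servers themselves.
Since I have the server certificate and truststore files configured, I could also switch to mode http and still do termination at the HAProxy.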
I'm no HAProxy expert, but I am pretty sure the above should help you.
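For comparison, a pure TCP passthrough configuration carries no ssl keyword on either the bind line or the server lines, so HAProxy only relays bytes and the handshake happens with HS2. Here is a minimal Python sketch that renders such a config; render_passthrough is a hypothetical helper, and the port and host names are taken from earlier in this thread as examples.

```python
def render_passthrough(listen_port: int, backends: dict[str, str]) -> str:
    """Render an HAProxy TCP passthrough config (no TLS handled by HAProxy)."""
    lines = [
        "frontend hiveserver2_front",
        f"    bind *:{listen_port}",  # no "ssl crt": TLS is not terminated here
        "    mode tcp",
        "    option tcplog",
        "    default_backend hiveserver2",
        "",
        "backend hiveserver2",
        "    mode tcp",
        "    balance source",
    ]
    for name, address in backends.items():
        # No "ssl ca-file": the encrypted stream is forwarded untouched.
        lines.append(f"    server {name} {address} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_passthrough(10001, {
        "hiveserver2_1": "abc:10000",
        "hiveserver2_2": "xyz:10000",
    }))
```

With this layout no certificate is configured in HAProxy at all, but the HS2 certificates must then be valid for the load balancer's host name.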
Created 11-01-2018 04:35 AM
I was engaged in some other projects, so I was unable to reply. I started working on this again today.
I removed the load balancer entry from the Hive configuration and connected to both HiveServer2 instances without HAProxy.
I am able to connect to both HiveServer2 instances from beeline.
I have HiveServer2 TLS enabled using CA-signed certificates, and the HiveServer2 certificates are in Java format, i.e. .jks (keystore.jks and truststore.jks).
In my HAProxy configuration, I am giving keystore.jks on the bind line and truststore.jks on the backend server lines for both servers.
Kindly confirm whether this is correct, or suggest changes.
- VIjay M
Created 11-06-2018 03:27 PM
Hi @VijayM,
Without seeing the configuration you have, it is hard to say what is correct. Perhaps you can share it and we can see if there is something obvious. I would strongly suggest looking at the HAProxy logs and the HiveServer2 logs when the problem happens to look for any TLS errors or related messages.
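One thing worth checking if HAProxy is doing TLS itself: the crt and ca-file options expect PEM files, not Java keystores, so a .jks file would need converting first. The Python sketch below only assembles the usual two-step conversion (keytool to PKCS12, then openssl to PEM) without running it; the alias and output path are placeholders, and both tools will prompt for the keystore passwords.

```python
import shlex


def jks_to_pem_commands(jks_path: str, alias: str, pem_path: str) -> list[list[str]]:
    """Return the keytool and openssl commands for a JKS to PEM conversion."""
    p12_path = pem_path + ".p12"  # intermediate PKCS12 file
    return [
        ["keytool", "-importkeystore",
         "-srckeystore", jks_path, "-srcstoretype", "JKS", "-srcalias", alias,
         "-destkeystore", p12_path, "-deststoretype", "PKCS12"],
        # -nodes leaves the private key unencrypted in the PEM output.
        ["openssl", "pkcs12", "-in", p12_path, "-out", pem_path, "-nodes"],
    ]


if __name__ == "__main__":
    # Hypothetical alias and destination path.
    for cmd in jks_to_pem_commands("keystore.jks", "hs2-host", "/tmp/cert_key.pem"):
        print(shlex.join(cmd))
```

None of this is needed for pure passthrough, where HAProxy never touches the certificates.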
Created 11-06-2018 08:12 PM
Created 11-13-2018 12:45 AM
Below are the details of the certificates which I have on the cluster.
The certificate below is from the root CA.
-rwxr-xr-x. 1 cloudera-scm cloudera-scm 8152 Oct 5 10:36 cacerts.pem
The files below are the keystore and truststore used by the TLS-enabled Hive service.
-rwxr-xr-x. 1 cloudera-scm cloudera-scm 9624 Oct 5 10:38 cloudera_keystore.jks
-rwxr-xr-x. 1 cloudera-scm cloudera-scm 4048 Oct 5 10:39 cloudera_truststore.jks
Below is the HAProxy configuration:
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend hiveserver2_front
bind *:443
option tcplog
mode tcp
default_backend hiveserver2
# This is the setup for HS2. beeline client connect to load_balancer_host:10001.
# HAProxy will balance connections among the list of servers listed below.
backend hiveserver2
mode tcp
balance source
option ssl-hello-chk
server hiveserver2_1 abc:10000 check
server hiveserver2_2 xyz:10000 check
--- The HiveServer2 configuration in Cloudera Manager is set with the property below
HiveServer2 Load Balancer abc:443
1. Kindly confirm: do I have to add https or http in the above property? Is it required?
2. Kindly review the configuration and let me know if any more details are required.
3. One of the HiveServer2 instances and the HAProxy service are configured on the same server, i.e. abc. Is that an issue?
Kindly suggest?
- Vijay M
Created 11-14-2018 12:37 PM
Based on your original message and your configuration, I think the HAProxy bind port is the issue.
You have:
bind *:443
But you are trying to connect via TLS to port 10001.
Maybe try:
bind *:10001
Then restart HAProxy.
Hope it is that simple. If that doesn't work, let us know and we can use openssl s_client to observe the handshake to see what happens.
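A quick way to catch this kind of mismatch is to compare the port in the JDBC URL with the ports HAProxy actually listens on. Here is a rough Python sketch; the two helper names are made up, and the parsing is deliberately naive (it would be confused by IPv6 bind addresses, for example).

```python
import re


def haproxy_listen_ports(config_text: str) -> set[int]:
    """Collect ports from 'bind' lines and from 'listen name :port' lines."""
    ports = set()
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.startswith(("bind ", "listen ")):
            ports.update(int(p) for p in re.findall(r":(\d+)\b", line))
    return ports


def jdbc_port(url: str) -> int:
    """Extract the explicit port from a jdbc:hive2:// URL."""
    match = re.match(r"jdbc:hive2://[^/:;]+:(\d+)", url)
    if match is None:
        raise ValueError(f"no explicit port in {url!r}")
    return int(match.group(1))


if __name__ == "__main__":
    config = "frontend hiveserver2_front\nbind *:443\nmode tcp\n"
    url = "jdbc:hive2://abc:10001/default;ssl=true"
    print(jdbc_port(url) in haproxy_listen_ports(config))
```

For the config and URL posted in this thread the check prints False, since the URL targets 10001 while the frontend binds 443.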
Created 11-14-2018 09:54 PM
Same issue with port 10001 or port 443.
The output below confirms HAProxy is started and listening on port 10001.
[root@abc ~]# ps -ef | grep -i haproxy
root 2620129 1 0 06:19 ? 00:00:00 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy 2620130 2620129 0 06:19 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
haproxy 2620131 2620130 0 06:19 ? 00:00:00 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -Ds
[root@abc ~]# netstat -tunlp | grep 10001
tcp 0 0 0.0.0.0:10001 0.0.0.0:* LISTEN 2620131/haproxy
[root@abc ~]#
Below are the 2 scenarios which I tried. Kindly check and suggest a fix.
Case1:
When I removed the HAProxy load balancer property from the Hive configuration and tried to connect to the individual HiveServer2 instances through beeline, I was able to connect. Output below.
beeline> !connect jdbc:hive2://abc:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connecting to jdbc:hive2://abc:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connected to: Apache Hive (version 1.1.0-cdh5.15.1)
Driver: Hive JDBC (version 1.1.0-cdh5.15.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://a301-8883-0447.gdzd.ubs.net:1>
beeline> !connect jdbc:hive2://xyz:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
scan complete in 2ms
Connecting to jdbc:hive2://xyz:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connected to: Apache Hive (version 1.1.0-cdh5.15.1)
Driver: Hive JDBC (version 1.1.0-cdh5.15.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://a301-8883-2675.gdzd.ubs.net:1>
Case2:
With the HAProxy load balancer property in the Hive configuration and port 10001 configured in the HAProxy configuration, it's not working and throws an error.
beeline> !connect jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connecting to jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://abc:10001/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM: javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake (state=08S01,code=0)
beeline>
Neither the HiveServer2 logs (on either instance) nor the HAProxy log are updated when the above error occurs.
In the above scenario, when I try to connect to the individual HiveServer2 instances, I am able to connect to the HiveServer2 instance on the host where HAProxy is running, but I am unable to connect to the other HiveServer2 instance and get an error (GSS initiate failed).
Output for both is below.
-- Successfully able to connect to the HiveServer2 instance where HAProxy is also running.
beeline> !connect jdbc:hive2://abc:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connecting to jdbc:hive2://abc:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Connected to: Apache Hive (version 1.1.0-cdh5.15.1)
Driver: Hive JDBC (version 1.1.0-cdh5.15.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://a301-8883-0447.gdzd.ubs.net:1>
--- Unable to connect to the other HiveServer2 instance
beeline> !connect jdbc:hive2://xyz:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
scan complete in 2ms
Connecting to jdbc:hive2://xyz:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://xyz:10000/default;ssl=true;sslTrustStore=/app/bds/security/pki/cloudera_truststore.jks;sslTrustPassword=*****;principal=hive/_HOST@REALM: Peer indicated failure: GSS initiate failed (state=08S01,code=0)
beeline>
The HiveServer2 log shows the error below.
2018-11-15 06:46:08,217 ERROR org.apache.thrift.transport.TSaslTransport: [HiveServer2-Handler-Pool: Thread-40]: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)]
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:199)
Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism level: Checksum failed)
at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:856)
at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:167)
... 14 more
Caused by: KrbException: Checksum failed
2018-11-15 06:46:08,220 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-40]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: GSS initiate failed
Caused by: org.apache.thrift.transport.TTransportException: GSS initiate failed
- Vijay M
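A note on the GSS failure above: as far as I understand, the client replaces _HOST in the principal with the host name taken from the JDBC URL, so each connection asks the KDC for a ticket for a different service principal. "Checksum failed" on the server side usually means the ticket was encrypted with a key that differs from the one in that server's keytab. The Python sketch below only illustrates the substitution under that assumption; expand_principal is a hypothetical helper and not part of the Hive JDBC driver.

```python
def expand_principal(principal: str, server_host: str) -> str:
    """Replace a _HOST instance component with the host being connected to."""
    service, _, rest = principal.partition("/")
    instance, at, realm = rest.partition("@")
    if instance == "_HOST":
        instance = server_host.lower()
    return f"{service}/{instance}{at}{realm}"


if __name__ == "__main__":
    # Hosts as used in this thread.
    for host in ("abc", "xyz"):
        print(host, expand_principal("hive/_HOST@REALM", host))
```

Comparing the principal each connection ends up requesting with the entries that klist -kt shows for the hive keytab on that HS2 host may help narrow down which key does not match.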