Support Questions

Find answers, ask questions, and share your expertise

Connection failed on HIVE server

avatar
Contributor

HI,

We are getting a alert on 2 of our HIVE server:

Koffi_0-1642546002480.png

With the following fail log on ambari:

Connection failed on host xx-xxx-x1-xx03.xxxxx.xx:10000 (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/HIVE/package/alerts/alert_hive_thrift_port.py", line 204, in execute
    ldap_password=ldap_password)
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/hive_check.py", line 89, in check_thrift_port_sasl
    timeout_kill_strategy=TerminateStrategy.KILL_PROCESS_TREE,
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
    returns=self.resource.returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
    raise ExecuteTimeoutException(err_msg)
ExecuteTimeoutException: Execution of 'ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib/jvm/java-1.8.0/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ppptlbs/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/lib/hive/bin/:/usr/sbin/'"'"' ; ! (beeline -u '"'"'jdbc:hive2://xx-xxx-x1-xx03.xxxxx.xx:10000/;transportMode=binary;ssl=true;sslTrustStore=/XXX/XXX/XXX/XXX/xxx-xxx.xxx.xx.jks;trustStorePassword= <PWD> ;principal=hive/xxx-xxx.xxx.xx@XXXX.XXXXX.XX'"'"' -n hive -e '"'"';'"'"' 2>&1 | awk '"'"'{print}'"'"' | grep -vz -i -e '"'"'Connected to:'"'"' -e '"'"'Transaction isolation:'"'"' -e '"'"'inactive HS2 instance; use service discovery'"'"')'' was killed due timeout after 120 seconds
)

 

We have also the following  log on hiveserver2.log

2022-01-18T16:56:43,875 ERROR [HiveServer2-Handler-Pool: Thread-169]: server.TThreadPoolServer (:()) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_222]
        at javax.security.auth.Subject.doAs(Subject.java:360) ~[?:1.8.0_222]
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) ~[hadoop-common-3.1.1.3.0.1.0-187.jar:?]
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.387]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_222]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_222]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        ... 10 more
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) ~[?:1.8.0_222]
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:154) ~[?:1.8.0_222]
        at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2020) ~[?:1.8.0_222]
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1127) ~[?:1.8.0_222]
        at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1367) ~[?:1.8.0_222]
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:931) ~[?:1.8.0_222]
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) ~[?:1.8.0_222]
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_222]
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_222]
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_222]
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) ~[hive-exec-3.1.0.3.0.1.0-187.jar:3.1.0.3.0.1.0-187]
        ... 10 more

Could you please help us solve this issue?

 

Thank you

2 REPLIES 2

avatar
Guru

@Koffi  Do you ovserve Hiveserver2 going down or you see the false alerts in Ambari ?

 

Can you please check ps -ef | grep hiveserver2 ===> To check if HS2 is up.

 

The connection between ambari-agent and Hivesrerver2 broke as per the logs due to SSL connectivity.

avatar
Expert Contributor

Did u see these alerts once you enabled SSL in the cluster?