Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to resolve 'SASL negotiation failure' of sometimes execute hive2 action in secure 5.7 cluster

avatar
Expert Contributor

HI

We work with CDH 5.7 secure cluster. 

we run hive2 action with oozie.

1.    We find the below logs in hive server sometimes.

______________________

2016-08-06 00:09:06,778 ERROR org.apache.thrift.transport.TSaslTransport: [HiveServer2-Handler-Pool: Thread-52]: SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=xxx, renewer=hive, realUser=hive/xxx.idc1.xx@XXX, issueDate=1470413336969, maxDate=1471018136969, sequenceNumber=9, masterKeyId=2]
at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:594)
at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244)
at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=xxx, renewer=hive, realUser=hive/xxx.idc1.xx@XXX, issueDate=1470413336969, maxDate=1471018136969, sequenceNumber=9, masterKeyId=2
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:114)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:588)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:619)
at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:585)
... 15 more
2016-08-06 00:09:06,779 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-52]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
... 10 more

______________________________________

 

2. I find hive2 action task sometimes failed and more time successed. 

    2.1 the success logs is below.

           ________________________________________

Connecting to jdbc:hive2://xxxx.idc1.xxx:10000/
Error: Could not open client transport with JDBC Uri: jdbc:hive2://xxxx.idc1.xxx:10000/: Peer indicated failure: DIGEST-MD5: IO error acquiring password (state=08S01,code=0)
Connected to: Apache Hive (version 1.1.0-cdh5.7.1)
Driver: Hive JDBC (version 1.1.0-cdh5.7.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
No rows affected (0.078 seconds)
INFO : Compiling command(queryId=hive_20160808000909_06e0d60a-7dcd-485f-b177-f83aced6ee9b): use xxx

            _____________________________________

 

2.2 the failed log is below, and it happend sometimes.

        Connecting to jdbc:hive2://xxxx.idc1.xxx:10000/
Error: Could not open client transport with JDBC Uri: jdbc:hive2://xxxx.idc1.xxx:10000/: Peer indicated failure: DIGEST-MD5: IO error acquiring password (state=08S01,code=0)
No current connection
Connected to: Apache Hive (version 1.1.0-cdh5.7.1)
Driver: Hive JDBC (version 1.1.0-cdh5.7.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Closing: 0: jdbc:hive2://xxxx.idc1.xxx:10000/
Intercepting System.exit(2)

__________________________________________________

 

could you help me to resolve this issue that sometimes the hive2 action failed?

 

thanks in advance.

 

BR

Paul

 

11 REPLIES 11

avatar
Expert Contributor

HI

could you help me to look into this issue. It let our jobs failed sometimes.

 

Why the connection was closed sometimes?

 

BR

Paul

avatar
Expert Contributor

HI

I worked on this issue about three days, 

I try to find the logs, and change the heap size to 6G of hive server2 and 8G of hive metastore.

the behaivor still happens. I just run 12 oozie jobs, these is may be 3 jobs concurrency.

 

I repeat the issue:
when I work on hive2 action with oozie, sometimes the hiveserver cannot connect to hive metastore.

 

Who can show me the way?

 

BR

Paul

 

 

avatar
Rising Star

I am facing the same issue as well. How are you passing the JDBC URL to Oozie?

avatar
Expert Contributor

HI

we pass the jdbc url as below:

****************
	<credentials>
        <credential name="hs2-creds" type="hive2">
            <property>
                <name>hive2.server.principal</name>
                <value>${jdbcPrincipal}</value>
            </property>
            <property>
                <name>hive2.jdbc.url</name>
                <value>${jdbcURL}</value>
            </property>
        </credential>
    </credentials>
****************
<action name="drop-hive-partion"  cred="hs2-creds">
        <hive2 xmlns="uri:oozie:hive2-action:0.2">
            <jdbc-url>${jdbcURL}</jdbc-url>
            <script>drop_partion.hql</script>
			<param>dateStr=${dateStr}</param> 
			<param>database_name=${database_name}</param>
			<param>table_name=${table_name}</param>
        </hive2>
        <ok to="merge-file"/>
        <error to="fail"/>
	</action>

Maybe this is the problem when we use the HA hiveserver2, and HA hiveMetaStore. After, go to one hiveserver, one hiveMetastore, the issue gone. But I dont know if this is the real reason.

avatar
Rising Star

I was able to resolve this issue by using HS2 on DataNodes. I don't have HMS HA but I do have multiple Hive Servers behind a load balancer. 

The problem arose when I moved the HS2 roles to application nodes (non-Datanodes) but when I moved them back to the DataNodes, these errors swiftly disappeared. 

avatar
Rising Star

I looked in a little more depth on this issue. Here are the findings:

 

1. When I use 1 HS2 behind the LoadBalancer, it succeeds without any error

2. When I use 2 HS2 behind the LoadBalancer, I see one error but it retries and succeeds

3. When I use 3 or more than 3 HS2 behind the LoadBalancer, it errors out.

 

I have tried turning the sticky connections for TCP inside the LoadBalancer but to no avail. Looks like there's some sort of Kerberos handshake for which it comes back to the LoadBalancer. Here, it'll then hit another server and then fail. 

avatar

Hi,

 

I am using the CDH5.8.0 and getting the same error when running hiveserver2 behind the load balancer, When running through beeline it is working fine but getting error when running through oozie,below are the logs:

 

2016-12-30 12:11:47,213 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-40]: Session disconnected without closing properly, close it now
2016-12-30 12:11:47,273 INFO hive.metastore: [HiveServer2-Handler-Pool: Thread-40]: Closed a connection to metastore, current connections: 1
2016-12-30 12:48:14,954 ERROR org.apache.thrift.transport.TSaslTransport: [HiveServer2-Handler-Pool: Thread-50]: SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=hyadav, renewer=hive, realUser=hive/ttamaster02@TTA, issueDate=1483102077222, maxDate=1514638077222, sequenceNumber=1, masterKeyId=1]
at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:598)
at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:244)
at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=hyadav, renewer=hive, realUser=hive/ttamaster02@TTA, issueDate=1483102077222, maxDate=1514638077222, sequenceNumber=1, masterKeyId=1
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:114)
at org.apache.hadoop.hive.thrift.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java:56)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java:588)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java:619)
at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:589)
... 15 more
2016-12-30 12:48:14,956 ERROR org.apache.thrift.server.TThreadPoolServer: [HiveServer2-Handler-Pool: Thread-50]: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:765)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:762)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:762)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException: DIGEST-MD5: IO error acquiring password
at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
... 10 more
2017-01-02 07:22:06,167 WARN org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-51]: Error getting tables:
org.apache.hive.service.cli.HiveSQLException: Invalid SessionHandle: SessionHandle [ada90085-134f-42f7-a3b8-357ff30bad22]
at org.apache.hive.service.cli.session.SessionManager.getSession(SessionManager.java:346)
at org.apache.hive.service.cli.CLIService.getTables(CLIService.java:309)
at org.apache.hive.service.cli.thrift.ThriftCLIService.GetTables(ThriftCLIService.java:562)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetTables.getResult(TCLIService.java:1393)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetTables.getResult(TCLIService.java:1378)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:718)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2017-01-02 07:22:06,171 INFO org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-51]: Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V7

 

Can anyone suggest whats the cause of the issue and how to fix it.

avatar
New Contributor

Hi,

did you find a solution?

I am experiencing the same issue in two CDH 5.6.1 clusters with HS2 HA+SSL and no HMS HA.

 

Basically HiveServer2 Oozie actions login to HiveServer2 via beeline and auth themself via delegation tokens pre-acquired by the Oozie server and passed in the YARN job container. If it happens that the HiveServer2 used (via the load balancer) by the Oozie Server is different from the one used by the Oozie HiveAction2 job the error fires saying the delegation token is not valid.

I only know that I have to store Hive Metastore Server delegation tokens in the metastore DB if you have multiple HMS but this is not my case because I only have one HMS (I had two in the past but I had to move back to one due to Sentry ACL Sync requirements).

 

I am also submitting the Oozie job via a proxy user with the doAs option but I don't think it would matter because the delegation token is correctly requested by the Oozie server for the doAs user.

 

Any help from Cloudera?

 

Regards,

Stefano

avatar
Expert Contributor

this iuuse still happened on CDH6.2。

currently i restart hive server to reslove this issue。

 

Cloudera suporter,how to reslove this iuuse while open kerberos?

 

really urgent, thanks