Intermittent Keytrustee KMS WARN and job failures

Expert Contributor

I intermittently see these errors with Keytrustee KMS. When I check, both Keytrustee KMS servers are up and running without any issues (hadoop key list and fetching key metadata from each KMS server both succeed without errors).

We see these errors intermittently in Pig jobs, and the jobs succeed when we rerun them.

Can anyone shed light on this issue and suggest next steps for troubleshooting?

2019-04-14 04:00:55,380 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-04-14 04:00:55,380 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2019-04-14 04:00:56,160 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2019-04-14 04:00:56,190 [JobControl] INFO org.apache.hadoop.hdfs.DFSClient - Created token for func_svc_dig_11ent: HDFS_DELEGATION_TOKEN owner=func_abc@VSP.SAS.COM, renewer=yarn, realUser=, issueDate=1555228856156, maxDate=1555836656176, sequenceNumber=96594554, masterKeyId=1976 on ha-hdfs:nameservice1
2019-04-14 04:00:56,367 [JobControl] WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:func_abc@VSP.SAS.COM (auth:KERBEROS) cause:org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Connection refused)
2019-04-14 04:00:56,370 [JobControl] WARN org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider - KMS provider at [https://cdn84au.xxx.xxx.com:16000/kms/v1/] threw an IOException:
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1024)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:193)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:190)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:123)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.addDelegationTokens(LoadBalancingKMSClientProvider.java:190)
at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:110)
at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2333)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:140)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)

1 REPLY

Mentor
- Do you observe this intermittency from only specific client/gateway hosts?
- Does your cluster apply firewall rules between the cluster hosts?

One probable reason for the intermittent 'Connection refused' from KMS is that it is frequently (auto)restarting. Check its process stdout messages and service logs to confirm whether a kill is causing it to be restarted by the CM Agent supervisor.
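
To help answer the first two questions, a small probe run from a suspect client/gateway host can record when the KMS port actually refuses connections. The following is only a rough sketch, not a Cloudera tool: the function names `probe` and `monitor` are made up for illustration, and it tests plain TCP reachability only (no TLS or Kerberos). Port 16000 is taken from the KMS URL in the log above.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Nothing is listening on the port (for example, the process is restarting)
        return "refused"
    except socket.timeout:
        # No answer at all, which often points to a firewall dropping packets
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable network, and similar problems
        return f"error: {exc}"


def monitor(targets, attempts=10, interval=5.0):
    """Probe every (host, port) pair repeatedly.

    Returns a list of (timestamp, host, port, status) tuples for each
    attempt that did not result in 'open'.
    """
    failures = []
    for attempt in range(attempts):
        for host, port in targets:
            status = probe(host, port)
            if status != "open":
                stamp = datetime.now().isoformat(timespec="seconds")
                failures.append((stamp, host, port, status))
        if attempt < attempts - 1:
            time.sleep(interval)
    return failures
```

For example, calling `monitor([("<kms-host-1>", 16000), ("<kms-host-2>", 16000)])` with your own KMS host names from a gateway host returns the timestamps of any refused or timed-out attempts. If 'refused' entries line up with restart times in the KMS service logs, that supports the restart theory; 'timeout' entries seen from only some hosts would point more toward firewall rules.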