
Cloudera - Kerberos GSS initiate failed (Even when valid ticket is available in the application)

Problem statement: Our application intermittently fails with a GSSException while submitting MR/Spark jobs:
 
GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
 
Application context: We have a Java application that launches Spark and MapReduce jobs in both local mode (setting spark.master=local and mapreduce.framework.name=local) and distributed mode. For certain use cases we also call the getSplits method of the org.apache.hadoop.mapred.InputFormat class explicitly, depending on the source data. For example, for an HCatalog source we call HCatInputFormat's getSplits method. This works most of the time, but intermittently the getSplits call fails with the GSSException on a Kerberos-enabled cluster.
 
We create a new UGI before launching any job, using the following code:
 
    // Log in from the keytab and get a UGI dedicated to this job
    UserGroupInformation kerberosUGI = UserGroupInformation
            .loginUserFromKeytabAndReturnUGI(Principalname, keytabPath);
 
Code for launching the job:
 
    kerberosUGI.doAs(new PrivilegedExceptionAction<Void>() {
        @Override
        public Void run() throws Exception {
            // launch job
            return null;
        }
    });
 
We also re-login the user by calling the reloginFromKeytab method of org.apache.hadoop.security.UserGroupInformation to deal with expired tickets.
We have not been able to identify the exact scenario in which the failure occurs, but we have observed that it mostly happens when the application has been up and running for more than 3 days.
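 
As a rough illustration of the relogin step mentioned above, the following is a minimal sketch of one way to force a relogin immediately before each job launch. The names ReloginBeforeJob, Relogin, runWithRelogin and launchJob are hypothetical and are not taken from the application. The Hadoop calls appear only in comments so that the snippet compiles without Hadoop on the classpath. It only shows the ordering of the two steps and is not a confirmed fix for the exception.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: always run a relogin step right before the job.
final class ReloginBeforeJob {

    /** Stand-in for the relogin call, e.g. kerberosUGI::reloginFromKeytab. */
    @FunctionalInterface
    interface Relogin {
        void run() throws Exception;
    }

    /**
     * Runs the relogin step first and then the job, returning the job result.
     * If the relogin throws, the job is never started.
     */
    static <T> T runWithRelogin(Relogin relogin, Callable<T> job) throws Exception {
        relogin.run();
        return job.call();
    }

    // Assumed wiring in a Hadoop application (not compiled here):
    //   runWithRelogin(
    //       kerberosUGI::reloginFromKeytab,
    //       () -> kerberosUGI.doAs((PrivilegedExceptionAction<Void>) () -> {
    //           launchJob();   // hypothetical method
    //           return null;
    //       }));

    public static void main(String[] args) throws Exception {
        // Stand-in lambdas so the sketch runs without a Kerberos setup.
        String status = runWithRelogin(
                () -> System.out.println("relogin from keytab (stand-in)"),
                () -> {
                    System.out.println("launch job (stand-in)");
                    return "submitted";
                });
        System.out.println(status);
    }
}
```

Keeping the relogin and the doAs call in one helper makes it harder to launch a job from a code path that skipped the relogin, which is the kind of gap that could go unnoticed in a long-running process.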
 
 
 
Logs:
 
Tue, 16 Oct 2018 15:31:41,102 IST [dag-scheduler-event-loop]:ERROR:1184:ERROR:java.lang.RuntimeException: serious problem
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:152)
    ...
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    .....
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "intelli-i0048.kyvostest.com/172.26.43.28"; destination host is: "intelli-i0044.kyvostest.com":8020;
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
    ... 49 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "intelli-i0048.kyvostest.com/172.26.43.28"; destination host is: "intelli-i0044.kyvostest.com":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy17.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:625)
    at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy18.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2143)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1076)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1000)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735)
    at org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:667)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:361)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:720)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 27 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
    ... 30 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    ... 39 more
 
One more trace, from a different cluster:
 
DEBUG:java.lang.RuntimeException: serious problem
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:152)
    ....
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "test-kyvos1.mhadev.local/10.210.89.250"; destination host is: "master2.mhadev.local":8020;
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:998)
    ... 22 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "test-kyvos1.mhadev.local/10.210.89.250"; destination host is: "master2.mhadev.local":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:785)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy15.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:625)
    at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy16.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2143)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1076)
    at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1059)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1004)
    at org.apache.hadoop.hdfs.DistributedFileSystem$23.doCall(DistributedFileSystem.java:1000)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1018)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1735)
    at org.apache.hadoop.hive.shims.Hadoop23Shims.listLocatedStatus(Hadoop23Shims.java:667)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:361)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:634)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.call(OrcInputFormat.java:620)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    ... 1 more
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:720)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
    at org.apache.hadoop.ipc.Client.call(Client.java:1451)
    ... 27 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:414)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:758)
    ... 30 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    ... 39 more