Kerberos error while authenticating with KMS endpoint in pyspark

Expert Contributor

Hello,

I am getting an error after upgrading from CDH 5.16 to CDP 7.1.7. The logs show that the client is unable to authenticate with the KMS endpoint. I first start a Spark session in sparkmagic, then run the example PySpark code below:

 

Starting Spark application
ID | YARN Application ID            | Kind    | State | Spark UI | Driver log | Current session?
23 | application_1639802810085_6070 | pyspark | idle  | Link     | Link       |
 
SparkSession available as 'spark'.

###############sample pyspark code###################

from pyspark.sql import SparkSession

spark = SparkSession.builder.master('local').getOrCreate()

# load data from .csv file in HDFS
# tips = spark.read.csv("/user/hive/warehouse/tips/", header=True, inferSchema=True)

# OR load data from table in Hive metastore
tips = spark.table('db1.table1')

from pyspark.sql.functions import col, lit, mean

# query using DataFrame API
#tips \

# query using SQL
spark.sql("select name from db1.table1").show(3)


spark.stop()

 

 

An error occurred while calling o85.showString.
: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1051)
	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:255)
	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:252)
	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
	at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.getDelegationToken(LoadBalancingKMSClientProvider.java:252)

Caused by: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1916)
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1029)
	... 67 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: http://kmshostxyz.com:16000/kms/v1/?op=GETDELEGATIONTOKEN&renewer=yarn%2Fyarnhost%40KERBEROSREALM
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:237)


	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
	... 68 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:365)
	at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
	... 78 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
	at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)

 

	... 79 more

Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 381, in show
    print(self._jdf.showString(n, 20, vertical))
  File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
    return f(*a, **kw)
  File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o85.showString.
: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1051)
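
When reading a nested trace like this one, the last "Caused by:" line is usually the most specific. Below is a minimal sketch that pulls that line out of a Java stack trace held in a string. The helper name root_cause is hypothetical (it is not part of PySpark, py4j, or Hadoop), and the sample text is an abbreviated copy of the trace in this post.

```python
def root_cause(java_trace):
    """Return the text of the last 'Caused by:' line in a Java stack trace.

    Falls back to the first non-empty line when the trace has no
    'Caused by:' entries, and to an empty string for empty input.
    """
    prefix = "Caused by:"
    lines = [line.strip() for line in java_trace.splitlines()]
    causes = [line[len(prefix):].strip() for line in lines if line.startswith(prefix)]
    if causes:
        return causes[-1]
    for line in lines:
        if line:
            return line
    return ""


# Abbreviated copy of the trace from this post.
sample_trace = """\
: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
\tat org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1051)
Caused by: java.lang.reflect.UndeclaredThrowableException
\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1916)
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
\tat sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
"""

print(root_cause(sample_trace))
```

For the trace above, this points at the GSSException line, i.e. the JVM making the KMS call did not find a Kerberos TGT.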

 

There is also an older related thread, but it has no resolution:

https://community.cloudera.com/t5/Support-Questions/Not-able-to-access-the-files-in-HDFS-encryption-...

2 REPLIES

Master Guru

@ebeb Are you able to kinit from the host with the existing keytab? That's a valid check to start with, and then we can see where the issue is.


Cheers!

Expert Contributor
Hello @GangWar ,
Yes, I can run kinit -kt successfully for all the user IDs, including yarn, livy, and my own user ID, from the same server.
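
A successful kinit in a shell does not by itself show that the process running the PySpark code sees a ticket cache, so a related check can be run from inside the session. The following is only a sketch under assumptions: it assumes the MIT Kerberos klist binary is on the PATH of the host where the code runs (its -s flag makes klist exit 0 only when a readable, unexpired credentials cache exists), and has_valid_tgt is a hypothetical helper name. It is a diagnostic aid, not a confirmed fix for this error.

```python
import os
import subprocess


def has_valid_tgt(klist_cmd=("klist", "-s")):
    """Run a klist-style command and report whether it found valid credentials.

    Returns True on exit status 0, False on any other exit status,
    and None when the binary cannot be found on this host.
    """
    try:
        completed = subprocess.run(list(klist_cmd), check=False)
    except FileNotFoundError:
        return None
    return completed.returncode == 0


# KRB5CCNAME, when set, names the credentials cache this process will use.
print("KRB5CCNAME =", os.environ.get("KRB5CCNAME"))
print("valid TGT visible to this process:", has_valid_tgt())
```

If this prints False or None inside the sparkmagic session while kinit works in a terminal on the same server, the two are probably not using the same credentials cache or the same user.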