Support Questions
Find answers, ask questions, and share your expertise

After enabling Kerberos, Spark does not work

Contributor

When I use spark-submit to run some code, Spark fails with the following error:

Traceback (most recent call last):
  File "/home/lizhen/test.py", line 27, in <module>
  abc = raw_data.count()
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1006, in count
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 997, in sum
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 871, in fold
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 773, in collect
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/usr/hdp/2.3.4.0-3485/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.io.IOException: java.net.ConnectException: Connection refused
   at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:888)
   at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:86)
   at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2243)
   at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
   at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
   at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
   at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206)
   at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
   at scala.Option.getOrElse(Option.scala:120)
   at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
   at scala.Option.getOrElse(Option.scala:120)
   at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
   at org.apache.spark.api.python.PythonRDD.getPartitions(PythonRDD.scala:58)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
   at scala.Option.getOrElse(Option.scala:120)
   at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
   at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
   at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
   at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
   at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
   at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:497)
   at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
   at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
   at py4j.Gateway.invoke(Gateway.java:259)
   at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
   at py4j.commands.CallCommand.execute(CallCommand.java:79)
   at py4j.GatewayConnection.run(GatewayConnection.java:207)
   at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
   at java.net.PlainSocketImpl.socketConnect(Native Method)
   at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
   at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
   at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
   at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
   at java.net.Socket.connect(Socket.java:589)
   at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
   at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
   at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
   at sun.net.www.http.HttpClient.New(HttpClient.java:308)
   at sun.net.www.http.HttpClient.New(HttpClient.java:326)
   at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
   at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
   at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
   at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
   at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:190)
   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
   at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
   at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
   at org.apache.hadoop.crypto.key.kms.KMSClientProvider$2.run(KMSClientProvider.java:875)
   at org.apache.hadoop.crypto.key.kms.KMSClientProvider$2.run(KMSClientProvider.java:870)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:870)
   ... 41 more
1 ACCEPTED SOLUTION

"Connection refused" invariably means there is no service listening at the destination. Here I'd assume that either the KMS configuration is wrong (its URL), or the KMS is not currently running.
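One way to check both possibilities is sketched below. The config file path, property name, host, and port are placeholder assumptions for an HDP-style install, not values taken from this cluster; substitute your own.

```shell
# Which key provider is HDFS configured to use? (path/property may differ
# on your install; guarded so it degrades gracefully if the file is absent)
grep -A1 "dfs.encryption.key.provider.uri" /etc/hadoop/conf/hdfs-site.xml \
  2>/dev/null || echo "property not found (config path may differ)"

# Is anything actually listening at that URL? A "Connection refused" here
# reproduces the failure Spark hit. kms.example.com:9292 is a placeholder.
curl -sf "http://kms.example.com:9292/kms" >/dev/null 2>&1 \
  && echo "KMS reachable" || echo "KMS not reachable"
```

If the property points at the right URL but the endpoint is unreachable, start (or install) the KMS service and retry the Spark job.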


4 REPLIES

Contributor

When I test using the MapReduce example, the same error appears.

Do your service checks (Spark, HDFS, YARN, MapReduce, etc.) pass? If they do, have you acquired a Kerberos ticket? What does "klist" say? If "klist" lists nothing, you have to acquire a ticket using "kinit", either as an end user or as the spark or hdfs service user. First try to list HDFS: does "hdfs dfs -ls /" work?
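Those checks can be sketched as shell commands. The principal and keytab path below are placeholders, not values from this thread; the commands are guarded so they degrade gracefully on a machine without the Kerberos or HDFS clients installed.

```shell
# 1. Do you hold a valid Kerberos ticket?
klist 2>/dev/null || echo "no ticket cache (or klist not installed)"

# 2. If klist shows nothing, acquire a ticket first, e.g. as an end user:
#      kinit lizhen@EXAMPLE.COM          # placeholder principal
#    or with a service keytab (path varies by install):
#      kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs

# 3. Does plain HDFS access work before involving Spark?
hdfs dfs -ls / 2>/dev/null || echo "hdfs -ls failed (no ticket or no client)"
```

If both the ticket and the plain HDFS listing are fine, the problem lies beyond basic Kerberos authentication (as it turned out to here).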


Contributor

Yes, you are right. After I installed the KMS service, it works!