Created on 06-27-2019 01:52 AM - edited 09-16-2022 07:28 AM
When upgrading from CDH 6.1 to 6.2 the following SparkPi example started failing with following error:
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "<host>/ip"; destination host is: "<host>":8020; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:808) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1503) at org.apache.hadoop.ipc.Client.call(Client.java:1445) at org.apache.hadoop.ipc.Client.call(Client.java:1355) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:875) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1624) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1495) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1492) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1507) at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$prepareLocalResources$4.apply(ApplicationMaster.scala:209) at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$prepareLocalResources$4.apply(ApplicationMaster.scala:206)
here is the Spark example job i ran:
# kinit'ed as user johndoe spark-submit --master yarn --deploy-mode client --proxy-user johndoe --conf spark.driver.extraJavaOptions=-noverify --conf spark.executor.extraJavaOptions=-noverify --conf spark.yarn.queue=default --driver-memory 2048m --num-executors 2 --executor-memory 2g --executor-cores 1 --class org.apache.spark.examples.SparkPi /opt/cloudera/parcels/CDH/lib/spark/examples/jars/spark-examples_2.11-2.4.0-cdh6.2.0.jar 10
When i remove `--proxy-user johndoe` the app completes successfully.
Any advice how to overcome this issue?
Note that when I download the Apache Spark 2.4 from the official Spark website - the above command runs successfully.