Unable to connect to secured hadoop of HDP cluster from standalone spark job.

New Contributor

I am using Spark standalone 1.6.x to connect to Kerberos-enabled Hadoop 2.7.x in an HDP cluster.

The code works fine if I run it in Spark local mode; I face the issue below only in cluster and client modes.

// Imports required by this snippet (stream and activeNamenodeURI are defined
// elsewhere in the job):
//   import java.net.URI;
//   import org.apache.hadoop.conf.Configuration;
//   import org.apache.hadoop.fs.FileStatus;
//   import org.apache.hadoop.fs.FileSystem;
//   import org.apache.hadoop.fs.Path;
//   import org.apache.hadoop.security.UserGroupInformation;
//   import org.apache.spark.api.java.function.Function;
//   import org.apache.spark.streaming.api.java.JavaDStream;

JavaDStream<String> status = stream.map(new Function<String, String>() {

  public String call(String arg0) throws Exception {
       // Point the client at the secured cluster.
       Configuration conf = new Configuration();
       conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
       conf.set("hadoop.security.authentication", "kerberos");
       conf.set("dfs.namenode.kerberos.principal", "hdfs/_HOST@REALM");
       UserGroupInformation.setConfiguration(conf);

       // Log in from the keytab on the executor. Note: Java does not expand
       // "~" in paths, so an absolute keytab path is safer here.
       UserGroupInformation.setLoginUser(
           UserGroupInformation.loginUserFromKeytabAndReturnUGI("abc", "~/abc.keytab"));
       System.out.println("Logged in successfully.");

       // List the HDFS root to verify access.
       FileSystem fs = FileSystem.get(new URI(activeNamenodeURI), conf);
       for (FileStatus fileStatus : fs.listStatus(new Path("/"))) {
            System.out.println(fileStatus.getPath().toString());
       }
       return "success";
    }
 });

I am getting the exception below:

User : abc@REALM (auth:KERBEROS)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "hostname1/0.0.0.0"; destination host is: "hostname2":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy44.create(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy45.create(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1725)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1668)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1593)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
at com.abc.HDFSFileWriter.createOutputFile(HDFSFileWriter.java:354)
... 21 more
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:643)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:730)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 43 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:553)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:368)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:722)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:718)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)

4 REPLIES

Re: Unable to connect to secured hadoop of HDP cluster from standalone spark job.

Super Collaborator

Do you have a valid Kerberos ticket or keytab on the machine running the Spark job? Run klist there to verify it before submitting the job.
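A quick programmatic version of the same check, as a sketch only (the keytab path here is a placeholder, not taken from the thread):

// Log in from the keytab and confirm Kerberos credentials were obtained.
// The keytab path is a placeholder; use an absolute path on your machine.
Configuration conf = new Configuration();
conf.set("hadoop.security.authentication", "kerberos");
UserGroupInformation.setConfiguration(conf);
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
    "abc", "/etc/security/keytabs/abc.keytab");
System.out.println(ugi.getUserName() + " hasKerberosCredentials=" + ugi.hasKerberosCredentials());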

Another possible root cause is the _HOST placeholder in the service principal. _HOST substitution is performed by Hadoop, not by Spark itself, so a standalone Spark installation will not resolve it. Try using the actual server name instead, for example as shown after the link below.

https://issues.apache.org/jira/browse/SPARK-12646
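A sketch of what that would look like ("nn-host.example.com" is a placeholder for your actual NameNode hostname):

// Spell out the NameNode principal instead of relying on _HOST substitution,
// which standalone Spark will not perform. The hostname is a placeholder.
conf.set("dfs.namenode.kerberos.principal", "hdfs/nn-host.example.com@REALM");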

Re: Unable to connect to secured hadoop of HDP cluster from standalone spark job.

New Contributor

Thanks for your reply, Harald.

I am able to kinit with the provided keytab and principal. I am also using the following code to get a ticket on the machine where the Spark executor is running:

UserGroupInformation.setLoginUser(UserGroupInformation.loginUserFromKeytabAndReturnUGI("abc", "~/abc.keytab"));

Also, as per your suggestion, I replaced _HOST with the active NameNode hostname, but I still get the same error as above.
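For reference, a commonly suggested pattern for this kind of failure (a sketch only, not something confirmed in this thread) is to keep the UGI returned by the keytab login and run the filesystem calls inside its doAs(), so the RPC connection authenticates with those credentials rather than the executor JVM's default identity:

// Requires: import java.security.PrivilegedExceptionAction;
// conf and activeNamenodeURI come from the code above and must be
// (effectively) final to be captured; the keytab path is a placeholder.
UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(
    "abc", "/etc/security/keytabs/abc.keytab");
FileStatus[] listing = ugi.doAs(new PrivilegedExceptionAction<FileStatus[]>() {
  public FileStatus[] run() throws Exception {
       FileSystem fs = FileSystem.get(new URI(activeNamenodeURI), conf);
       return fs.listStatus(new Path("/"));
  }
});
for (FileStatus f : listing) {
     System.out.println(f.getPath().toString());
}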

Re: Unable to connect to secured hadoop of HDP cluster from standalone spark job.

New Contributor

@apekagnihotri I know this is an old thread, but were you able to find a solution to this problem?
I have the same error, and if you could mention the cause/fix, it would be really helpful.

Re: Unable to connect to secured hadoop of HDP cluster from standalone spark job.

Community Manager

Hi @Prado, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.


Regards,

Vidya Sargur,
Community Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
