Created on 11-21-2023 09:01 AM - edited 11-21-2023 09:03 AM
I have an HDP 2.6.5 cluster (kerberized) on a public cloud, and I need to access HDFS from outside (public access) through the HDFS CLI (not WebHDFS).
Since the external hadoop-client host can't be enrolled through Ambari, I just downloaded the Hadoop 2.7.3 package and configured core-site.xml and hdfs-site.xml as below.
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://datalake-cstest9</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>datalake-cstest9</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.datalake-cstest9</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.datalake-cstest9.nn1</name>
    <value>mnode0.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.datalake-cstest9.nn2</name>
    <value>mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.datalake-cstest9.nn1</name>
    <value>mnode0.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.datalake-cstest9.nn2</name>
    <value>mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com:50070</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.datalake-cstest9</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>
</configuration>
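With both files in place, the client is pointed at them and a Kerberos ticket is obtained before running any command. A minimal sketch of the steps (the conf directory path is just an example):

# Point the Hadoop client at the configuration files above (example path)
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
# Obtain a Kerberos ticket for the cluster realm
kinit <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM
# Run HDFS CLI commands as usual
hdfs dfs -ls /user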
I checked that the ports are reachable and correctly configured.
Ping is OK:
ping mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com
PING mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*) 56(84) bytes of data.
64 bytes from mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*): icmp_seq=1 ttl=49 time=89.8 ms
64 bytes from mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*): icmp_seq=2 ttl=49 time=86.9 ms
64 bytes from mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*): icmp_seq=3 ttl=49 time=86.8 ms
64 bytes from mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*): icmp_seq=4 ttl=49 time=87.8 ms
64 bytes from mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com (51.*.*.*): icmp_seq=5 ttl=49 time=86.9 ms
Ports 8020 & 50070 are open as well.
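For reference, a TCP port check like the following (using nc; exact flags may vary by distribution) confirms reachability:

nc -zv mnode0.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com 8020
nc -zv mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com 8020
nc -zv mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com 50070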
When I try to list HDFS folders, I get this error:
hdfs dfs -ls /user
2023-11-20 16:04:27,389 WARN [main] security.UserGroupInformation (UserGroupInformation.java:hasSufficientTimeElapsed(1193)) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2023-11-20 16:04:27,979 WARN [main] security.UserGroupInformation (UserGroupInformation.java:hasSufficientTimeElapsed(1193)) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2023-11-20 16:04:30,240 WARN [main] security.UserGroupInformation (UserGroupInformation.java:hasSufficientTimeElapsed(1193)) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2023-11-20 16:04:32,641 WARN [main] security.UserGroupInformation (UserGroupInformation.java:hasSufficientTimeElapsed(1193)) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
2023-11-20 16:04:33,900 WARN [main] ipc.Client (Client.java:run(678)) - Couldn't setup connection for <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM to mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/51.*.*.*:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: ICMP Port Unreachable)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:375)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:729)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1657)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Caused by: GSSException: No valid credentials provided (Mechanism level: ICMP Port Unreachable)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:777)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 40 more
Caused by: java.net.PortUnreachableException: ICMP Port Unreachable
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
at java.net.DatagramSocket.receive(DatagramSocket.java:812)
at sun.security.krb5.internal.UDPClient.receive(NetClient.java:206)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:404)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.krb5.KdcComm.send(KdcComm.java:348)
at sun.security.krb5.KdcComm.sendIfPossible(KdcComm.java:253)
at sun.security.krb5.KdcComm.send(KdcComm.java:229)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbTgsReq.send(KrbTgsReq.java:221)
at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:236)
at sun.security.krb5.internal.CredentialsUtil.serviceCredsSingle(CredentialsUtil.java:477)
at sun.security.krb5.internal.CredentialsUtil.serviceCredsReferrals(CredentialsUtil.java:369)
at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:333)
at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:314)
at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:169)
at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:490)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
2023-11-20 16:07:56,779 WARN [main] retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(122)) - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo over mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/51.*.*.*:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup connection for <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM to mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/51.*.*.*:8020; Host Details : local host is: "vm-ubuntu.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/10.0.2.15"; destination host is: "mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1657)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:326)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:235)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:218)
at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
Caused by: java.io.IOException: Couldn't setup connection for <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM to mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/51.*.*.*:8020
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:679)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:737)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 28 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: ICMP Port Unreachable)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:560)
at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:375)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:729)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
... 31 more
Caused by: GSSException: No valid credentials provided (Mechanism level: ICMP Port Unreachable)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:777)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
... 40 more
Caused by: java.net.PortUnreachableException: ICMP Port Unreachable
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:143)
at java.net.DatagramSocket.receive(DatagramSocket.java:812)
at sun.security.krb5.internal.UDPClient.receive(NetClient.java:206)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:404)
at sun.security.krb5.KdcComm$KdcCommunication.run(KdcComm.java:364)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.krb5.KdcComm.send(KdcComm.java:348)
at sun.security.krb5.KdcComm.sendIfPossible(KdcComm.java:253)
at sun.security.krb5.KdcComm.send(KdcComm.java:229)
at sun.security.krb5.KdcComm.send(KdcComm.java:200)
at sun.security.krb5.KrbTgsReq.send(KrbTgsReq.java:221)
at sun.security.krb5.KrbTgsReq.sendAndGetCreds(KrbTgsReq.java:236)
at sun.security.krb5.internal.CredentialsUtil.serviceCredsSingle(CredentialsUtil.java:477)
at sun.security.krb5.internal.CredentialsUtil.serviceCredsReferrals(CredentialsUtil.java:369)
at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:333)
at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:314)
at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:169)
at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:490)
at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:695)
... 43 more
ls: Failed on local exception: java.io.IOException: Couldn't setup connection for <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM to mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/51.*.*.*:8020; Host Details : local host is: "vm-ubuntu.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com/10.0.2.15"; destination host is: "mnode1.7458907e-6f32-4f4a-b33e-6820be708ad4.datalake.com":8020;
The Kerberos ticket was obtained successfully:
Ticket cache: FILE:/tmp/krb5cc_1001
Default principal: <USER>@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM
Valid starting       Expires              Service principal
11/20/2023 15:54:26 11/21/2023 15:54:15 krbtgt/7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM@7458907E-6F32-4F4A-B33E-6820BE708AD4.DATALAKE.COM
Any ideas why I can't access HDFS? Is any information missing?
Created 11-23-2023 05:45 AM
Hi, were you able to verify client-to-KDC-server connectivity, to rule out a networking/firewall issue? You could run with the option below to see more details:
-Dsun.security.krb5.debug=true
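For example, the option can be passed to the CLI through HADOOP_OPTS (or HADOOP_CLIENT_OPTS):

# Enable Kerberos debug output in the JVM, then re-run the failing command
export HADOOP_OPTS="-Dsun.security.krb5.debug=true"
hdfs dfs -ls /user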
Created 11-28-2023 09:17 AM
Thanks @Majeti
Indeed, with krb5 debug enabled it showed an error when connecting to port 88, even though that port was open (TCP only). Java's Kerberos client tries UDP first by default, so opening TCP alone was not enough.
I opened port 88 on UDP as well, and it worked.
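For anyone hitting the same problem but unable to open UDP 88, forcing the Kerberos client to use TCP for KDC traffic in /etc/krb5.conf on the client should work as well (a minimal sketch):

[libdefaults]
    # Messages larger than this limit go over TCP, so a limit of 1
    # effectively forces all KDC traffic onto TCP port 88
    udp_preference_limit = 1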