Support Questions
Find answers, ask questions, and share your expertise

DistCp command not working from HDP 2.6.1 (Kerberos) to CDP Private Cloud 7.1.5 (non-Kerberos), shows errors

Explorer

Hi Community

 

I am trying to copy HDFS data from a kerberized HDP 2.6.x cluster to a non-kerberized CDP Private Cloud Base 7.1.5 cluster. I have tried other ports as well, and it still gives me an error.

 

This is the command I run in the console:

hadoop distcp hdfs://svr2.localdomain:1019/tmp/distcp_test.txt hdfs://svr1.local:9866/tmp/

 

What could be the origin of the fault?

 

Thank you

 

***********ERROR*************************

Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>>KinitOptions cache name is /tmp/krb5cc_11259
>>>DEBUG <CCacheInputStream> client principal is useradm/admin@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> server principal is krbtgt/LOCALDOMAIN@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> key type: 18
>>>DEBUG <CCacheInputStream> auth time: Sun May 09 18:39:04 VET 2021
>>>DEBUG <CCacheInputStream> start time: Sun May 09 18:39:04 VET 2021
>>>DEBUG <CCacheInputStream> end time: Mon May 10 18:39:04 VET 2021
>>>DEBUG <CCacheInputStream> renew_till time: Sun May 16 18:39:04 VET 2021
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL;
>>>DEBUG <CCacheInputStream> client principal is useradm/admin@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/fast_avail/krbtgt/LOCALDOMAIN@LOCALDOMAIN@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> key type: 0
>>>DEBUG <CCacheInputStream> auth time: Wed Dec 31 20:00:00 VET 1969
>>>DEBUG <CCacheInputStream> start time: null
>>>DEBUG <CCacheInputStream> end time: Wed Dec 31 20:00:00 VET 1969
>>>DEBUG <CCacheInputStream> renew_till time: null
>>> CCacheInputStream: readFlags()
21/05/09 19:23:36 WARN ipc.Client: Exception encountered while connecting to the server : java.io.EOFException
21/05/09 19:23:36 ERROR tools.DistCp: Invalid arguments:
java.io.IOException: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "svr2.localdomain/10.x.x.x"; destination host is: "svr1.local":9866;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:818)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:227)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:462)
Caused by: java.io.IOException: java.io.EOFException
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:720)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 22 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
... 25 more
Invalid arguments: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "server2.localdomain/10.x.x.x"; destination host is: "svr1.local":9866;

 

Thanks!

1 ACCEPTED SOLUTION

Expert Contributor

Your command should ideally look like this:

 

hadoop distcp -Dipc.client.fallback-to-simple-auth-allowed=true hdfs://svr2.localdomain:8020/tmp/distcp_test.txt webhdfs://svr1.local:50070/tmp/

 

Let me know how it goes.

 

Thanks,

Megh



Cloudera Employee

Hi @vciampa ,

 

It looks like the arguments being passed are invalid:

Invalid arguments: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "server2.localdomain/10.x.x.x"; destination host is: "svr1.local":9866;

 

Can you try using source://nameservice:port and dest://nameservice:port URIs, and run the DistCp once more?
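A hedged sketch of what that could look like. The hostnames are taken from the question, but port 8020 is an assumption (the default NameNode RPC port — check fs.defaultFS on each cluster); note that the ports in the original command, 1019 and 9866, are listed later in this thread as dfs.datanode.address values, i.e. DataNode ports, which do not answer NameNode RPC:

```shell
# Sketch only: hostnames from the question; 8020 assumes the default
# NameNode RPC port -- verify fs.defaultFS on each cluster.
# Ports 1019/9866 are DataNode ports (dfs.datanode.address) and will not
# answer NameNode RPC, which is one way to end up with an EOFException.
SRC="hdfs://svr2.localdomain:8020/tmp/distcp_test.txt"
DST="hdfs://svr1.local:8020/tmp/"
echo hadoop distcp "$SRC" "$DST"
```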

Explorer

Source Server = kerberos

Command run from the console:

hadoop distcp hdfs://svr2.localdomain:1019/tmp/distcp_test.txt hdfs://svr1.local:9866/tmp

 

*************** ERROR ******************

Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>>KinitOptions cache name is /tmp/krb5cc_11259
>>>DEBUG <CCacheInputStream> client principal is useradm/admin@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> server principal is krbtgt/LOCALDOMAIN@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> key type: 18
>>>DEBUG <CCacheInputStream> auth time: Tue May 11 08:24:10 VET 2021
>>>DEBUG <CCacheInputStream> start time: Tue May 11 08:24:10 VET 2021
>>>DEBUG <CCacheInputStream> end time: Wed May 12 08:24:10 VET 2021
>>>DEBUG <CCacheInputStream> renew_till time: Tue May 18 08:24:10 VET 2021
>>> CCacheInputStream: readFlags() FORWARDABLE; RENEWABLE; INITIAL;
>>>DEBUG <CCacheInputStream> client principal is useradm/admin@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> server principal is X-CACHECONF:/krb5_ccache_conf_data/fast_avail/krbtgt/LOCALDOMAIN@LOCALDOMAIN@LOCALDOMAIN
>>>DEBUG <CCacheInputStream> key type: 0
>>>DEBUG <CCacheInputStream> auth time: Wed Dec 31 20:00:00 VET 1969
>>>DEBUG <CCacheInputStream> start time: null
>>>DEBUG <CCacheInputStream> end time: Wed Dec 31 20:00:00 VET 1969
>>>DEBUG <CCacheInputStream> renew_till time: null
>>> CCacheInputStream: readFlags()
21/05/11 09:15:33 WARN ipc.Client: Exception encountered while connecting to the server : java.io.EOFException
21/05/11 09:15:33 ERROR tools.DistCp: Invalid arguments:
java.io.IOException: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "svr1.localdomain/10.x.x.x"; destination host is: "svr2.locall":9866;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:818)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:227)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:462)
Caused by: java.io.IOException: java.io.EOFException
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:720)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:683)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:770)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 22 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:595)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:397)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:762)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:758)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:757)
... 25 more
Invalid arguments: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "svr1.localdomain/10.x.x.x"; destination host is: "svr2.locall":9866;
usage: distcp OPTIONS [source_path...] <target_path>

*******************************************

More information:

1) Port 9866 (dfs.datanode.address) - open

2) Port 1019 (dfs.datanode.address) - open

3) svr1.localdomain = Kerberos-enabled (source - source files)

4) svr2.locall = non-Kerberos (target - destination files)

 

Thanks

Cloudera Employee

Can you please check whether you have made the changes described in the doc below?

 

https://docs.cloudera.com/cdp-private-cloud/latest/data-migration/topics/rm-migrate-securehdp-insecu...

 

I see that you are migrating data from a secured HDP cluster to an unsecured CDP cluster. Please correct me if my understanding is incorrect.

 

Explorer

Hi @Tylenol

It is not a migration.
It is a new installation, and I need to copy data from the HDP cluster to the CDP cluster.

 

Thanks

 

 

Cloudera Employee

Ah, got it — thanks for the update! Can you refer to the article once more, and can you also try copying from the source NameNode to the destination NameNode, like this:

 

hdfs://nn1:8020/foo/a
hdfs://nn1:8020/foo/b

https://hadoop.apache.org/docs/r3.0.3/hadoop-distcp/DistCp.html 

Explorer

Hi @Tylenol 

****Console command:

hadoop distcp hdfs://server4.localdomain:8020/tmp/distcp_test.txt hdfs://server8.local:8020/tmp

 

****NOTE:

server4 (source, HDP, Kerberos) and server8 (target, CDP, non-Kerberos) are the NameNodes

 

*************ERROR ****************
Java config name: null
Native config name: /etc/krb5.conf
Loaded from native config
>>>KinitOptions cache name is /tmp/krb5cc_11259
21/05/11 13:34:10 ERROR tools.DistCp: Invalid arguments:
java.io.IOException: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "server4.localdomain/10.x.x.x"; destination host is: "server8.local":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:818)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:227)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:462)
Caused by: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:787)
at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1620)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
... 22 more
Invalid arguments: Failed on local exception: java.io.IOException: Server asks us to fall back to SIMPLE auth, but this client is configured to only allow secure connections.; Host Details : local host is: "server4.localdomain/10.x.x.x.x"; destination host is: "server8.local":8020;

***********************************************

 

Thanks!

Cloudera Employee

It looks like the fallback mechanism hasn't been configured.

 

A fallback configuration is required when running DistCp to copy files between a secure and an insecure cluster.
Add the following property to the advanced configuration snippet (if using Cloudera Manager) or, if not, directly to the HDFS core-site.xml:

<property>
  <name>ipc.client.fallback-to-simple-auth-allowed</name>
  <value>true</value>
</property>
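The same property can also be supplied per job on the command line instead of editing core-site.xml. A sketch, assuming the default NameNode RPC port 8020 — the hostnames below are placeholders, not the real servers from this thread:

```shell
# Per-job alternative to the core-site.xml snippet above.
# secure-nn.example / insecure-nn.example are placeholder hostnames.
FALLBACK_OPT="-Dipc.client.fallback-to-simple-auth-allowed=true"
echo hadoop distcp "$FALLBACK_OPT" \
  "hdfs://secure-nn.example:8020/tmp/distcp_test.txt" \
  "hdfs://insecure-nn.example:8020/tmp/"
```

Note that -D is a generic Hadoop option, so it must come before the source and target paths.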

 

https://my.cloudera.com/knowledge/Copying-Files-from-Insecure-to-Secure-Cluster-using-DistCP?id=7487...

 

Expert Contributor

Hi @vciampa ,

 

In addition to the solution suggested by @Tylenol, also use webhdfs instead of hdfs for your destination, as the EOFException tends to occur when running DistCp between different versions of Hadoop.

 

Please paste your command and logs after trying this.

 

Thanks,

Megh

Expert Contributor

Your command should ideally look like this:

 

hadoop distcp -Dipc.client.fallback-to-simple-auth-allowed=true hdfs://svr2.localdomain:8020/tmp/distcp_test.txt webhdfs://svr1.local:50070/tmp/

 

Let me know how it goes.

 

Thanks,

Megh

Explorer

@vidanimegh, Thank you very much for your help.

Explorer

@Tylenol, Thank you very much for your help.
