
distcp between 2 kerberized clusters. Fails due to permissions


Hi all,

I have 2 kerberized clusters, both connected to the same AD, one running HDP 2.4 and the other HDP 2.5. I would like to move all the data from one cluster to the other.

I have been reading a lot about it, like the following links:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_Sys_Admin_Guides/content/ref-c8ffaa14-ea...

https://community.hortonworks.com/articles/18686/kerberos-cross-realm-trust-for-distcp.html

What I am doing is the following:

As the hdfs user on cluster 1, I can list all the files, but I can copy only the files on which the hdfs user has explicit permissions. For example:

A file with permissions 770, owned by user user1 and group hdfs, can be copied.

But a file with permissions 700, owned by user user1 and group hdfs or another group, cannot be copied.
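
To make the two cases above concrete, here is a minimal sketch of the mode-bit check they imply. It assumes the copying user is treated as an ordinary member of the file's group (not as an HDFS superuser) and that no ACLs apply; the function name group_can_read is made up for this illustration.

```shell
# Sketch: does the octal mode grant read access to members of the file's group?
# Assumption: plain POSIX-style mode bits only; ACLs and superuser rights are ignored.
group_can_read() {
  # Prefix with 0 so the shell parses the mode as octal, shift the group
  # triplet into the low bits, then test the read bit (value 4).
  [ $(( (0$1 >> 3) & 4 )) -ne 0 ]
}

for mode in 770 700; do
  if group_can_read "$mode"; then
    echo "$mode: group members can read"
  else
    echo "$mode: group members cannot read"
  fi
done
```

With mode 770 the group triplet is 7 (rwx), so a member of group hdfs can read the file; with 700 the group triplet is 0, so only the owner can.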

Also, the second cluster is configured for HA, but I cannot use the name defined for HA; I have to point directly at the active NameNode (which can be different each time).

With the following command:

hadoop distcp hdfs://master01/projects/folder hdfs://manager01/projects/.
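
On the HA point, one commonly used approach is to tell the client about the remote nameservice so that distcp can address it by its logical name. The sketch below only prints such a command; it does not run anything. The property names are the standard HDFS HA client properties, but the nameservice IDs, the NameNode IDs nn1/nn2, the host manager02 and port 8020 are assumptions and have to be replaced with the values from the remote cluster's hdfs-site.xml. The helper name build_ha_distcp is made up for this example.

```shell
# Sketch only: print a distcp command that uses the remote HA nameservice.
# Arguments: local nameservice, remote nameservice, remote NameNode 1 (host:port),
# remote NameNode 2 (host:port), source path, destination path.
build_ha_distcp() {
  local_ns="$1"; remote_ns="$2"; nn1="$3"; nn2="$4"; src="$5"; dst="$6"
  cat <<EOF
hadoop distcp \\
  -Ddfs.nameservices=${local_ns},${remote_ns} \\
  -Ddfs.ha.namenodes.${remote_ns}=nn1,nn2 \\
  -Ddfs.namenode.rpc-address.${remote_ns}.nn1=${nn1} \\
  -Ddfs.namenode.rpc-address.${remote_ns}.nn2=${nn2} \\
  -Ddfs.client.failover.proxy.provider.${remote_ns}=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider \\
  hdfs://${local_ns}${src} hdfs://${remote_ns}${dst}
EOF
}

# Hypothetical values; manager02 and port 8020 are placeholders.
build_ha_distcp cluster01 cluster02 manager01:8020 manager02:8020 /projects/folder /projects/
```

The same properties can instead be added to the hdfs-site.xml used by the client, which avoids repeating them on every command line.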

If the hdfs user does not have permissions on a file, I get the following error:

17/02/01 12:04:56 INFO mapreduce.Job: Task Id : attempt_1485252670123_0029_m_000003_1, Status : FAILED
Error: java.io.IOException: File copy failed: hdfs://cluster01/projects/folder --> hdfs://cluster02/projects/folder
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:285)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

What should I do to copy all the files? Do I first have to change all the permissions to 777?

Thanks in advance

3 REPLIES

@Jose Molero

Can you provide the output of the commands below?

hadoop fs -ls /projects/folder    (run on cluster 1)

hadoop fs -ls /projects/folder    (run on cluster 2)

@Jose Molero

Can you check the "hadoop.security.auth_to_local" property in your core-site.xml?

Ref link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_Sys_Admin_Guides/content/ref-c8ffaa14-ea...
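
As a rough sketch of how that check could be done from a shell on each cluster: the helper below prints two commands, one that reads the configured rules and one that shows which short user name a given principal maps to. The principal hdfs@EXAMPLE.COM is a placeholder and the helper name auth_to_local_checks is made up for this example.

```shell
# Sketch only: print the commands for inspecting auth_to_local; nothing is executed.
# Assumption: the principal passed in is a placeholder for the one used with kinit.
auth_to_local_checks() {
  principal="$1"
  # Show the rules currently configured in core-site.xml.
  echo "hdfs getconf -confKey hadoop.security.auth_to_local"
  # Show the short name the rules produce for this principal.
  echo "hadoop org.apache.hadoop.security.HadoopKerberosName ${principal}"
}

auth_to_local_checks "hdfs@EXAMPLE.COM"
```

Running the printed commands on a node of each cluster and comparing the results shows whether the same principal resolves to the same local user on both sides.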


@Jose Molero

I think you have to adjust your krb5.conf. The trick lies in the CAPATHS section: check that the configuration is identical on both clusters.

Please go through the attached document; it should help you.
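
One way to check that the [capaths] configuration really is identical is to copy krb5.conf from a node of each cluster and compare only that section. This is a minimal sketch; the function names and the file names in the usage comment are made up for this example, and it compares the text of the section as written (blank lines ignored), not its semantics.

```shell
# Sketch: extract and compare the [capaths] section of two krb5.conf files.
capaths_section() {
  # A line starting with "[" opens a new section; keep only the non-blank
  # lines that follow a [capaths] header.
  awk '
    /^\[/ { in_section = ($0 ~ /^\[capaths\]/); next }
    in_section && NF { print }
  ' "$1"
}

same_capaths() {
  [ "$(capaths_section "$1")" = "$(capaths_section "$2")" ]
}

# Usage (hypothetical file names, copied from one node of each cluster):
#   same_capaths krb5.cluster1.conf krb5.cluster2.conf && echo "capaths match"
```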