
Has anyone done distcp between secured clusters with different REALMs?

4 REPLIES


I have not done distcp with different Kerberos REALMs, but I think this should be possible. Our documentation only says that the "same principal name must be assigned to the applicable NameNodes", so that the auth_to_local configuration resolves the same username on both sides (e.g., the Kerberos principal nn/host1@realm maps to the user "nn"). As long as the different realms use the same KDC or the KDCs trust each other, this should work.
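For illustration only, here is a minimal sketch of what such an auth_to_local mapping could look like in core-site.xml on both clusters; the realm names A.EXAMPLE.COM and B.EXAMPLE.COM are placeholders, and the rules assume the nn/host@realm principal form used in this thread:

  <!-- core-site.xml (sketch): map nn/<any host>@<either realm> to the same local user "nn" -->
  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
      RULE:[2:$1@$0](nn@A\.EXAMPLE\.COM)s/.*/nn/
      RULE:[2:$1@$0](nn@B\.EXAMPLE\.COM)s/.*/nn/
      DEFAULT
    </value>
  </property>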


A search of the Hortonworks documentation indicates the following three requirements, besides a generally correct Kerberos setup:

1. Both clusters must be using Java 1.7 or later if you are using MIT Kerberos. Java 1.6 has too many known bugs with cross-realm trust; e.g., see http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7061379

2. The same principal name must be assigned to the NameNodes in both the source and the destination cluster. For example, if the Kerberos principal name of the NameNode in cluster A is nn/host1@realm, the Kerberos principal name of the NameNode in cluster B must be nn/host2@realm, not, for example, nn2/host2@realm; see http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_Sys_Admin_Guides/content/ref-263ee41f-a0a...

3. Bi-directional cross-realm trust must be set up. Correct trust setup can be tested by running an hdfs client on a node of cluster A and seeing whether you can put a file or list a directory on cluster B, and vice versa (see the sketch after this list); credit Robert Molina in the old Hortonworks Forums, post-49303.

Note: the key statement behind items #2 and #3 is that "It is important that each NodeManager can reach and communicate with both the source and destination file systems"; see https://hadoop.apache.org/docs/r2.7.1/hadoop-distcp/DistCp.html. Therefore the trust must be bi-directional.
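As a rough sketch of items #2 and #3 (realm names, hostnames, ports, and usernames below are placeholders, and the exact kadmin and krb5.conf details depend on your KDC layout): under MIT Kerberos, bi-directional trust is typically established by creating matching cross-realm krbtgt principals in both KDCs and making each realm resolvable from the cluster nodes, after which the client-side test from item #3 can be run from either side.

  # In kadmin on BOTH KDCs: create the cross-realm TGT principals in both directions,
  # with identical passwords/enctypes on each KDC (A.EXAMPLE.COM and B.EXAMPLE.COM are placeholders)
  addprinc krbtgt/B.EXAMPLE.COM@A.EXAMPLE.COM
  addprinc krbtgt/A.EXAMPLE.COM@B.EXAMPLE.COM

  # /etc/krb5.conf on the nodes of both clusters: make both realms and their hosts resolvable
  [realms]
    A.EXAMPLE.COM = {
      kdc = kdc-a.example.com
    }
    B.EXAMPLE.COM = {
      kdc = kdc-b.example.com
    }
  [domain_realm]
    .cluster-a.example.com = A.EXAMPLE.COM
    .cluster-b.example.com = B.EXAMPLE.COM

  # Trust test from a cluster A node (then repeat the mirror-image test from a cluster B node):
  kinit someuser@A.EXAMPLE.COM
  hdfs dfs -ls hdfs://nn-b.cluster-b.example.com:8020/tmp
  hdfs dfs -put ./localfile hdfs://nn-b.cluster-b.example.com:8020/tmp/

  # If both directions work, the distcp itself can be run, e.g. from cluster A:
  hadoop distcp hdfs://nn-a.cluster-a.example.com:8020/src hdfs://nn-b.cluster-b.example.com:8020/dest

If the two realms are not hierarchically related, a [capaths] section in krb5.conf may also be needed so clients can find the direct trust path.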

@Matt Foley

I have followed the same steps, but when I do distcp between two secured HA clusters, YARN throws a failed-to-renew-token error (kind: HDFS_DELEGATION_TOKEN, service: ha-hdfs). I am able to do hadoop fs -ls using the HA nameservice on both clusters. Both clusters have an MIT KDC, the cross-realm setup is done, and both clusters use the same NameNode principal. Is there anything else that I need to do?

Just some additional info: when I change the framework from yarn to MR in mapred-client.xml, I am able to do distcp. When I use the yarn framework I get the above error.

@sprakash

The fact that distcp works with some configurations indicates that you probably have security set up correctly, and it also gives you an obvious workaround. To try to answer your question, please provide some clarifying information:

  1. When you speak of mapred-client.xml, do you mean mapred-site.xml on the client machine?
  2. When you speak of changing the framework, do you mean the "mapreduce.framework.name" configuration parameter in mapred-site.xml? (See the sketch after these questions.)
  3. Do you change it only on the client machine, or throughout both clusters?
  4. The allowed values of that parameter are "local", "classic", and "yarn". When you change it to not be "yarn", what do you set it to?
  5. Do you have "mapreduce.application.framework.path" set? If so, to what value?
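For reference, a minimal sketch of the property asked about in questions 2 and 4, as it would typically appear in mapred-site.xml (the only values shown are the ones listed above):

  <!-- mapred-site.xml (sketch): the execution framework selector -->
  <property>
    <name>mapreduce.framework.name</name>
    <!-- allowed values: local, classic, yarn -->
    <value>yarn</value>
  </property>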