
To run distcp between two HDFS HA clusters (for example, A and B) using their nameservice IDs, or to set up Falcon clusters with NameNode HA, the following settings are needed.

Assume the nameservice IDs for clusters A and B are HAA and HAB, respectively.

Set the following properties in hdfs-site.xml:

Add the nameservice IDs of both clusters to dfs.nameservices. This needs to be done on both clusters.

dfs.nameservices=HAA,HAB

Add the property dfs.internal.nameservices, which tells each cluster which of the listed nameservices it serves locally:

In cluster A: dfs.internal.nameservices = HAA

In cluster B: dfs.internal.nameservices = HAB
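
For reference, a minimal sketch of how these first two properties might look in hdfs-site.xml on cluster A (cluster B would set dfs.internal.nameservices to HAB instead):

<property>
  <name>dfs.nameservices</name>
  <value>HAA,HAB</value>
</property>
<property>
  <name>dfs.internal.nameservices</name>
  <value>HAA</value>
</property>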

Add the property dfs.ha.namenodes.<nameservice> for both nameservices:

dfs.ha.namenodes.HAB=nn1,nn2

dfs.ha.namenodes.HAA=nn1,nn2

Add the property dfs.namenode.rpc-address.<nameservice>.<nn> for each NameNode:

dfs.namenode.rpc-address.HAB.nn1 = <clusterB_NN1_fqdn>:8020

dfs.namenode.rpc-address.HAB.nn2 = <clusterB_NN2_fqdn>:8020

dfs.namenode.rpc-address.HAA.nn1 = <clusterA_NN1_fqdn>:8020

dfs.namenode.rpc-address.HAA.nn2 = <clusterA_NN2_fqdn>:8020
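
As a sketch, the entries describing remote cluster B might look like the following in cluster A's hdfs-site.xml; b-nn1.example.com and b-nn2.example.com are placeholder FQDNs for cluster B's NameNodes, and 8020 assumes the default NameNode RPC port:

<property>
  <name>dfs.ha.namenodes.HAB</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.HAB.nn1</name>
  <value>b-nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.HAB.nn2</name>
  <value>b-nn2.example.com:8020</value>
</property>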

Add the property dfs.client.failover.proxy.provider.<nameservice> for the remote cluster's nameservice:

In cluster A: dfs.client.failover.proxy.provider.HAB = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

In cluster B: dfs.client.failover.proxy.provider.HAA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
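
On cluster A, a sketch of the corresponding hdfs-site.xml entry (the equivalent property for the local nameservice HAA should already be present from the original HA setup):

<property>
  <name>dfs.client.failover.proxy.provider.HAB</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>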

Restart the HDFS service on both clusters.
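
After the restart, a quick sanity check from a node in cluster A can confirm that the remote nameservice resolves; the path /tmp below is just an example:

hdfs getconf -confKey dfs.nameservices
hdfs dfs -ls hdfs://HAB/tmp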

Once this is complete, you will be able to run distcp using the nameservice IDs, similar to:

hadoop distcp hdfs://HAA/tmp/file1 hdfs://HAB/tmp/
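
If the copy is re-run or file attributes need to be preserved, standard distcp options such as -update (copy only files that are missing or differ at the target) and -p (preserve attributes such as replication and permissions) can be added, for example:

hadoop distcp -update -p hdfs://HAA/tmp/file1 hdfs://HAB/tmp/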
