In order to run distcp between two HDFS HA clusters (for example, clusters A and B) using their nameservice IDs, or to set up Falcon clusters with NameNode HA, the following settings are needed.

Assuming the nameservices for clusters A and B are HAA and HAB respectively.

Set the following properties in hdfs-site.xml:

Add the nameservice IDs of both clusters to dfs.nameservices. This needs to be done on both clusters.
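As a sketch in hdfs-site.xml form, the combined entry on each cluster would look like this (assuming the nameservice IDs HAA and HAB from above):

```xml
<!-- hdfs-site.xml on both clusters: list the local and the remote nameservice -->
<property>
  <name>dfs.nameservices</name>
  <value>HAA,HAB</value>
</property>
```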


Add the property dfs.internal.nameservices, which should list only the local nameservice:

In cluster A: dfs.internal.nameservices = HAA

In cluster B: dfs.internal.nameservices = HAB
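In hdfs-site.xml form, cluster A's entry would look like the sketch below (cluster B's is the same with HAB):

```xml
<!-- hdfs-site.xml on cluster A: only the local nameservice -->
<property>
  <name>dfs.internal.nameservices</name>
  <value>HAA</value>
</property>
```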

Add dfs.ha.namenodes.<nameservice>, listing the NameNode IDs of each nameservice. This is needed for both nameservices, on both clusters.
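A minimal sketch of these entries, assuming the NameNode IDs are nn1 and nn2 (an assumption; use your clusters' actual NameNode IDs):

```xml
<!-- hdfs-site.xml on both clusters: NameNode IDs for each nameservice -->
<property>
  <name>dfs.ha.namenodes.HAA</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.ha.namenodes.HAB</name>
  <value>nn1,nn2</value>
</property>
```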



Add the property dfs.namenode.rpc-address.<nameservice>.<nn> for every NameNode of both nameservices. The FQDN placeholders below refer to each cluster's own NameNode hosts:

dfs.namenode.rpc-address.HAB.nn1 = <NN1_fqdn>:8020

dfs.namenode.rpc-address.HAB.nn2 = <NN2_fqdn>:8020

dfs.namenode.rpc-address.HAA.nn1 = <NN1_fqdn>:8020

dfs.namenode.rpc-address.HAA.nn2 = <NN2_fqdn>:8020
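The entries above can be sketched in hdfs-site.xml form as follows; the hostnames are illustrative assumptions, and only the HAA pair is shown (repeat for HAB with cluster B's NameNode hosts):

```xml
<!-- hdfs-site.xml: RPC addresses for cluster A's NameNodes (example hostnames) -->
<property>
  <name>dfs.namenode.rpc-address.HAA.nn1</name>
  <value>nn1.cluster-a.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.HAA.nn2</name>
  <value>nn2.cluster-a.example.com:8020</value>
</property>
```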

Add the property dfs.client.failover.proxy.provider.<nameservice> for the remote nameservice, so each cluster's clients can fail over between the other cluster's NameNodes:

In cluster A: dfs.client.failover.proxy.provider.HAB = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

In cluster B: dfs.client.failover.proxy.provider.HAA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
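In hdfs-site.xml form, cluster A's entry would look like this sketch (cluster B's is the same with HAA in the property name):

```xml
<!-- hdfs-site.xml on cluster A: failover proxy provider for the remote nameservice HAB -->
<property>
  <name>dfs.client.failover.proxy.provider.HAB</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```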

Restart the HDFS service on both clusters.

Once complete, you will be able to run the distcp command using the nameservice IDs, similar to:

hadoop distcp hdfs://HAA/tmp/file1 hdfs://HAB/tmp/

Version history: Revision 1 of 1. Last update: 06-17-2016 12:42 PM