Distcp is failing in HA

Guru

When I try to run distcp between High Availability clusters, it fails with the error below.

[s0998@test ~]$ hadoop distcp  hdfs://HDPINFHA/user/s0998/sampleTest.txt hdfs://HDPTSTHA/user/root/
16/02/29 06:32:38 ERROR tools.DistCp: Invalid arguments: 
java.lang.IllegalArgumentException: java.net.UnknownHostException: HDPTSTHA
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
	at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
	at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
	at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
	at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
	at org.apache.hadoop.tools.DistCp.setTargetPathExists(DistCp.java:216)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:116)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:430)
Caused by: java.net.UnknownHostException: HDPTSTHA

I have already configured this as described in the URL below, though.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0-Win/bk_HDP_RelNotes_Win/content/behav-change...

11 REPLIES

Master Mentor

Please see this blog and double-check your values: http://henning.kropponline.de/2015/03/15/distcp-two-ha-cluster/

Guru

@Artem Ervits: When I changed dfs.nameservices to include both clusters, I was no longer able to restart the HDFS services.

resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X GET 'http://m1.hdp22:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hdfs'' returned status_code=403. 
{
  "RemoteException": {
    "exception": "StandbyException", 
    "javaClassName": "org.apache.hadoop.ipc.StandbyException", 
    "message": "Operation category READ is not supported in state standby"
  }
}

Master Mentor
@Saurabh Kumar

Only use the link I provided to double-check your values; for all of the values, refer to our docs as you did. Did you read this paragraph from the blog carefully?

"The other alternative is to configure the client with both service ids and make it aware of the way to identify the active NameNode of both clusters. For this you would need to define a custom configuration you are only going to use for distcp. The hdfs client can be configured to point to that config like this"

Create a custom XML file and pass it to the hadoop distcp command every time you want to distcp. Don't use that config as your global config for HDFS. Revert the configuration to its previous state in Ambari, create a custom hdfs-site.xml in your user directory, pass it to hadoop distcp, and report the results back.
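
For illustration, a minimal sketch of what such a distcp-only hdfs-site.xml could contain is below, assuming the nameservices from this thread (HDPINFHA as the local cluster, HDPTSTHA as the remote one); the NameNode hostnames are hypothetical placeholders, and in practice you would start from a copy of your local hdfs-site.xml and add the remote cluster's entries:

<configuration>
  <!-- List both nameservices so the client can resolve either logical URI. -->
  <property>
    <name>dfs.nameservices</name>
    <value>HDPINFHA,HDPTSTHA</value>
  </property>
  <!-- NameNode IDs and RPC addresses of the remote cluster (hypothetical hosts). -->
  <property>
    <name>dfs.ha.namenodes.HDPTSTHA</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.HDPTSTHA.nn1</name>
    <value>tst-nn1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.HDPTSTHA.nn2</name>
    <value>tst-nn2.example.com:8020</value>
  </property>
  <!-- Let the client discover which of the two remote NameNodes is active. -->
  <property>
    <name>dfs.client.failover.proxy.provider.HDPTSTHA</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>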

Guru

@Artem Ervits: I tried with an external config directory as well, but I am getting the error below.

[s0998dnz@lxhdpmastinf001 ~]$ hadoop --config conf/ distcp hdfs://HDPINFHA/user/s0998dnz/sampleTest.txt hdfs://HDPTSTHA/user/root/
16/03/01 07:40:35 ERROR tools.DistCp: Invalid arguments:
java.lang.IllegalArgumentException: java.net.UnknownHostException: HDPTSTHA
	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:406)
	at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:311)
	at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176

Master Mentor

@Saurabh Kumar I don't have an HA cluster to test with, but testing on the Sandbox worked for me with the hadoop command, not hdfs. Please double-check your properties. The safest route is to determine the active NameNode at the time of copy; I agree it's not the most optimal solution. Tagging experts @stevel @Chris Nauroth

# Copy the cluster's hdfs-site.xml so it can be customized for distcp only
cp /etc/hadoop/conf/hdfs-site.xml distcp.xml
# Keep the custom copy in its own configuration directory
mkdir confdir && mv distcp.xml confdir
# Run distcp against that configuration directory instead of /etc/hadoop/conf
hadoop --config confdir distcp hdfs://sandbox.hortonworks.com:8020/user/root/sample.json hdfs://sandbox.hortonworks.com:8020/user/root/sample.json5
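
For the "determine the active NameNode at the time of copy" route mentioned above, a rough sketch could look like the following; it assumes NameNode IDs nn1/nn2 as defined in dfs.ha.namenodes for the destination cluster, and a hypothetical hostname for its active NameNode. Note the caveat in the reply below: a copy pointed at a specific NameNode cannot survive an HA failover.

# On (or against the configuration of) the destination cluster, check which
# NameNode is currently active; haadmin prints "active" or "standby".
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Point the target at the active NameNode's RPC address (hypothetical hostname)
# instead of the nameservice the client cannot resolve.
hadoop distcp hdfs://HDPINFHA/user/s0998/sampleTest.txt hdfs://tst-active-nn.example.com:8020/user/root/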


"The safest route is to determine the active namenode at the time of copy,"

This would have an unfortunate side effect. Referencing the active NameNode's address directly means that the DistCp job wouldn't be able to survive an HA failover. If there was a failover in the middle of a long-running DistCp job, then you'd likely need to restart it from the beginning.

The HDFS-6376 patch mentioned throughout this question should be sufficient to enable a DistCp across HA clusters, assuming you are running an HDP version that has the patch. The original question includes a link to HDP 2.3 docs. If that is the version you are running, then that's fine, because HDFS-6376 is included in all HDP 2.3 releases. This is tested regularly and confirmed to be working.

If all else fails, then this sounds like a reason to file a support case for additional hands-on troubleshooting with your particular cluster. That might be more effective than trying to resolve it through HCC.

Master Mentor
@Saurabh Kumar
In order to distcp between two HDFS HA clusters (for example, A and B), modify the following in the hdfs-site.xml for both clusters:

For example, nameservice for cluster A and B is HAA and HAB respectively.

- Add the nameservices of both clusters: dfs.nameservices = HAA,HAB

- Add property dfs.internal.nameservices
In cluster A:
dfs.internal.nameservices = HAA
In cluster B:
dfs.internal.nameservices = HAB

- Add dfs.ha.namenodes.<nameservice> 
In cluster A
dfs.ha.namenodes.HAB = nn1,nn2
In cluster B
dfs.ha.namenodes.HAA = nn1,nn2

- Add property dfs.namenode.rpc-address.<cluster>.<nn>
In cluster A
dfs.namenode.rpc-address.HAB.nn1 = <NN1_fqdn>:8020 
dfs.namenode.rpc-address.HAB.nn2 = <NN2_fqdn>:8020
In cluster B
dfs.namenode.rpc-address.HAA.nn1 = <NN1_fqdn>:8020 
dfs.namenode.rpc-address.HAA.nn2 = <NN2_fqdn>:8020

- Add property dfs.client.failover.proxy.provider.<cluster - i.e HAA or HAB>
In cluster A
dfs.client.failover.proxy.provider.HAB = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
In cluster B
dfs.client.failover.proxy.provider.HAA = org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

- Restart HDFS service.

Once complete, you will be able to run the distcp command using the nameservice, similar to:
hadoop distcp hdfs://HDPINFHA/tmp/testDistcp hdfs://HDPTSTHA/tmp/
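
For reference, the steps above could be consolidated into an hdfs-site.xml fragment like the following sketch for cluster A (cluster B mirrors it with HAA and HAB swapped; the <NN_fqdn> hostnames are placeholders, as in the steps):

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>HAA,HAB</value>
  </property>
  <!-- Only the local nameservice is internal, so cluster A's own daemons
       do not try to serve cluster B. -->
  <property>
    <name>dfs.internal.nameservices</name>
    <value>HAA</value>
  </property>
  <!-- Remote (cluster B) NameNodes, so clients on A can resolve hdfs://HAB. -->
  <property>
    <name>dfs.ha.namenodes.HAB</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.HAB.nn1</name>
    <value><NN1_fqdn>:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.HAB.nn2</name>
    <value><NN2_fqdn>:8020</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.HAB</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>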

Guru

@Neeraj Sabharwal:

I followed the same steps but am still getting the same error.