07-19-2017 10:58 AM
Our "catch-22" situation: we have been continually getting java.nio.channels.UnresolvedAddressException when trying to do a distcp between our 2 clusters. Our setup is the following: we run CDH5 on 2 separate clusters, each with its own internal IPs, plus external IPs so they can communicate. Ports 8020 and 50010 are open between all of them. I think the internal IPs are the problem.
I run the following command from one of the datanodes on the namenode1 cluster:
distcp hdfs://namenode1:8020/somefile hdfs://namenode2:8020/
Each of the namenodes and datanodes on each cluster has the external IPs for the other cluster's nodes in its /etc/hosts.
The above command fails with the following doctored error message (my system has no access to the internet):
.....Caused by: java.nio.channels.UnresolvedAddressException at sun.nio.ch.Net.checkAddress(Net.java:101) ......
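For context, this exception comes from the JDK rather than from Hadoop itself: an NIO channel refuses to connect to a socket address whose host name was never resolved to an IP. The following is a minimal sketch of that behaviour, not Hadoop code. The class name `UnresolvedAddressDemo` and the host name `datanode-x` are made up for illustration, and port 50010 is simply the datanode port mentioned in the post.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

// Hypothetical demo class, not part of Hadoop.
final class UnresolvedAddressDemo {

    // Returns true when the connect attempt is rejected because the
    // address was never resolved to an IP.
    static boolean failsAsUnresolved(InetSocketAddress address) {
        try (SocketChannel channel = SocketChannel.open()) {
            channel.connect(address);
            return false;
        } catch (UnresolvedAddressException e) {
            return true;
        } catch (IOException e) {
            // Refused or timed-out connections are a different problem.
            return false;
        }
    }

    public static void main(String[] args) {
        // createUnresolved skips the lookup, which simulates a datanode host
        // name the client cannot resolve. "datanode-x" is a placeholder.
        InetSocketAddress simulated = InetSocketAddress.createUnresolved("datanode-x", 50010);
        System.out.println("simulated unresolved name fails: " + failsAsUnresolved(simulated));

        // Pass real host names as arguments to see how this JVM resolves them.
        for (String host : args) {
            InetSocketAddress address = new InetSocketAddress(host, 50010);
            System.out.println(host + " unresolved? " + address.isUnresolved());
        }
    }
}
```

Running it on the client node with the remote cluster's datanode host names as arguments is one way to see whether the /etc/hosts entries cover every name the client may be handed.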
The unresolved address appears to be the one on the namenode2 cluster. After reviewing many posts on similar issues, I found a suggestion to set "dfs.client.use.datanode.hostname=false". When I do this, it gets past the above error and attempts to create a file on the destination, but then fails with the following error:
Connection timed out at sun.nio.ch.SocketChannelImpl.checkConnect...... .... Abandoning BP-xxxxxxx- <the internal ip address>xxxxxxxx ...Excluding datanode DatanodeInfoWithStorage[<the internal ip>:50010, xxxxxx ......
Note that an empty file is created on the destination "tmp" directory.
After reviewing more posts on this error, the advice is that you NEED to have "dfs.client.use.datanode.hostname=true", which kind of makes sense BUT puts me back at the first error.
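The two failures are consistent with how this setting is generally understood to work: it decides whether the client dials a datanode by the IP the namenode reports or by the datanode's host name. The sketch below is a simplified model of that choice, not Hadoop's actual implementation. The class `DatanodeTarget`, the host name `dn1.cluster2.example`, and the IP `10.0.0.12` are all placeholders.

```java
// Simplified model of the client-side choice; this is NOT Hadoop's own code.
final class DatanodeTarget {
    private final String hostName; // name the datanode registered with the namenode
    private final String ipAddr;   // IP the datanode registered with (cluster-internal here)
    private final int xferPort;    // data transfer port, 50010 in the post

    DatanodeTarget(String hostName, String ipAddr, int xferPort) {
        this.hostName = hostName;
        this.ipAddr = ipAddr;
        this.xferPort = xferPort;
    }

    // useHostname stands in for dfs.client.use.datanode.hostname.
    String connectAddress(boolean useHostname) {
        return (useHostname ? hostName : ipAddr) + ":" + xferPort;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        DatanodeTarget dn = new DatanodeTarget("dn1.cluster2.example", "10.0.0.12", 50010);
        System.out.println("false -> " + dn.connectAddress(false));
        System.out.println("true  -> " + dn.connectAddress(true));
    }
}
```

Under this model, "false" yields the internal IP, which is unreachable from the other cluster and would explain the connection timeout. "true" yields a host name, which only helps if every remote datanode name resolves on the client to an external IP.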
Thanks for any assistance on where else to look. I can provide more detail if necessary.
07-19-2017 01:35 PM
In general, the namenode port # is 50070. It can also be customized, so you can double-check that in CM -> HDFS -> Configuration.
So you need to use namenode1:50070 instead of namenode1:8020 (in both source & target).
Also, I saw that you mentioned CDH 5, but make sure both source & target are the same version (including sub-versions like 5.2, 5.3, 5.7, etc). If they are not the same version then you need to use different commands; you can get more details in the link below.
07-20-2017 07:37 AM
Thanks for the response. I think that 50070 is the hftp port and both clusters are the same version of CDH5.
I was able to make progress in one direction only. First I added the remote namenode to the hadoop-conf/masters file and opened up port 8020 to 0.0.0.0/0 (wide open for some reason), so the following command worked from cluster1:
distcp hdfs://namenode2:8020/file hdfs://namenode1:8020/
So basically it works where the source is the remote cluster. I was able to do the same sort of thing from cluster2. I still cannot do it where the source is the cluster I am running the command from.
The bottom line is that we are trying to do an HBase snapshot export between the 2 clusters and keep receiving the UnresolvedAddressException. Since the export does a distcp under the covers, I was trying to debug that first, but I still get the same results doing the snapshot export:
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot mySnapshot -copy-to hdfs://namenode2/hbase
07-20-2017 08:27 AM