<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure) in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141441#M32077</link>
    <description>&lt;P&gt;Is a firewall blocking traffic between the clusters? Is there network connectivity between them? Does your user have full permissions to run DistCp?&lt;/P&gt;&lt;P&gt;Is this a large data set on a slow network?&lt;/P&gt;&lt;P&gt;The relevant error is: Caused by: java.net.SocketTimeoutException: connect timed out&lt;/P&gt;</description>
    <pubDate>Thu, 16 Jun 2016 03:35:59 GMT</pubDate>
    <dc:creator>TimothySpann</dc:creator>
    <dc:date>2016-06-16T03:35:59Z</dc:date>
    <item>
      <title>DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141439#M32075</link>
      <description>&lt;P&gt;Hello team,&lt;/P&gt;&lt;P&gt;
	I encountered errors while migrating data from a CDH 4.2 cluster to an HDP 2.4 cluster using DistCp. The details are below.
Please let me know your thoughts. &lt;/P&gt;&lt;UL&gt;
	
&lt;LI&gt;CDH 4.2, non-HA, non-secure.
NameNode: castor-namenode-01 (IP a.a.a.a)
	&lt;/LI&gt;	
&lt;LI&gt;Core Hadoop version: 2.0.0 &lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;
	HDP 2.4, HA, non-secure &lt;/P&gt;&lt;P&gt;
	Active NameNode: hdpmasternode07 &lt;/P&gt;&lt;P&gt;
	Standby NameNode: hdpmasternode06
&lt;A href="http://b.b.b.b:50070/dfshealth.html#tab-overview" target="_blank"&gt;http://b.b.b.b:50070/dfshealth.html#tab-overview&lt;/A&gt; &lt;/P&gt;&lt;P&gt;
	Core Hadoop version: 2.7&lt;/P&gt;&lt;P&gt;Passed test cases:
------------------ &lt;/P&gt;&lt;P&gt;Copying empty directories from the CDH cluster to the HDP 2.4 cluster.
Accessing HDFS on the HDP 2.4 cluster from the CDH cluster over the webhdfs protocol, and vice versa. &lt;/P&gt;&lt;P&gt;Command used
------------- &lt;/P&gt;&lt;P&gt;hdfs@hdpmasternode07:/$ hadoop distcp hftp://a.a.a.a:50070/tmp/inv261/retail-batch.inv261b.log.gz hdfs://b.b.b.b:8020/tmp&lt;/P&gt;&lt;PRE&gt;Logs:
------


hdfs@hdpmasternode07:/$ hadoop distcp hftp://a.a.a.a:50070/tmp/inv261/retail-batch.inv261b.log.gz hdfs://b.b.b.b:8020/tmp
16/06/15 19:27:45 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hftp://a.a.a.a:50070/tmp/inv261/retail-batch.inv261b.log.gz], targetPath=hdfs://b.b.b.b:8020/tmp, targetPathExists=true, preserveRawXattrs=false}
16/06/15 19:27:45 INFO impl.TimelineClientImpl: Timeline service address: &lt;A href="http://hdpmasternode06:8188/ws/v1/timeline/" target="_blank"&gt;http://hdpmasternode06:8188/ws/v1/timeline/&lt;/A&gt;
16/06/15 19:27:46 INFO impl.TimelineClientImpl: Timeline service address: &lt;A href="http://hdpmasternode06:8188/ws/v1/timeline/" target="_blank"&gt;http://hdpmasternode06:8188/ws/v1/timeline/&lt;/A&gt;
16/06/15 19:27:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
16/06/15 19:27:46 INFO mapreduce.JobSubmitter: number of splits:1
16/06/15 19:27:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1466011429718_0013
16/06/15 19:27:47 INFO impl.YarnClientImpl: Submitted application application_1466011429718_0013
16/06/15 19:27:47 INFO mapreduce.Job: The url to track the job: &lt;A href="http://hdpmasternode07:8088/proxy/application_1466011429718_0013/" target="_blank"&gt;http://hdpmasternode07:8088/proxy/application_1466011429718_0013/&lt;/A&gt;
16/06/15 19:27:47 INFO tools.DistCp: DistCp job-id: job_1466011429718_0013
16/06/15 19:27:47 INFO mapreduce.Job: Running job: job_1466011429718_0013
16/06/15 19:27:53 INFO mapreduce.Job: Job job_1466011429718_0013 running in uber mode : false
16/06/15 19:27:53 INFO mapreduce.Job:  map 0% reduce 0%
16/06/15 19:28:03 INFO mapreduce.Job:  map 100% reduce 0%
16/06/15 19:31:04 INFO mapreduce.Job: Task Id : attempt_1466011429718_0013_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: hftp://a.a.a.a:50070/tmp/inv261/retail-batch.inv261b.log.gz --&amp;gt; hdfs://b.b.b.b:8020/tmp/retail-batch.inv261b.log.gz
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:285)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:253)
	at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:50)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying hftp://a.a.a.a:50070/tmp/inv261/retail-batch.inv261b.log.gz to hdfs://b.b.b.b:8020/tmp/retail-batch.inv261b.log.gz
	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
	at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:281)
	... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.net.SocketTimeoutException: connect timed out
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.getInputStream(RetriableFileCopyCommand.java:302)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:247)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToFile(RetriableFileCopyCommand.java:183)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:123)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99)
	at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
	... 11 more
Caused by: java.net.SocketTimeoutException: connect timed out
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:579)
	at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
	at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
	at sun.net.www.http.HttpClient.&amp;lt;init&amp;gt;(HttpClient.java:211)
	at sun.net.www.http.HttpClient.New(HttpClient.java:308)
	at sun.net.www.http.HttpClient.New(HttpClient.java:326)
	at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:998)
	at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:934)
	at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:852)
	at sun.net.www.protocol.http.HttpURLConnection.followRedirect(HttpURLConnection.java:2412)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1559)
	at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:468)
	at org.apache.hadoop.hdfs.web.HftpFileSystem$RangeHeaderUrlOpener.connect(HftpFileSystem.java:370)
	at org.apache.hadoop.hdfs.web.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:135)
	at org.apache.hadoop.hdfs.web.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:116)
	at org.apache.hadoop.hdfs.web.ByteRangeInputStream.&amp;lt;init&amp;gt;(ByteRangeInputStream.java:101)
	at org.apache.hadoop.hdfs.web.HftpFileSystem$RangeHeaderInputStream.&amp;lt;init&amp;gt;(HftpFileSystem.java:383)
	at org.apache.hadoop.hdfs.web.HftpFileSystem$RangeHeaderInputStream.&amp;lt;init&amp;gt;(HftpFileSystem.java:388)
	at org.apache.hadoop.hdfs.web.HftpFileSystem.open(HftpFileSystem.java:404)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.getInputStream(RetriableFileCopyCommand.java:298)
	... 16 more


&lt;/PRE&gt;</description>
      <pubDate>Thu, 16 Jun 2016 02:47:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141439#M32075</guid>
      <dc:creator>Arun-</dc:creator>
      <dc:date>2016-06-16T02:47:24Z</dc:date>
    </item>
    <item>
      <title>Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141440#M32076</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/312/avoma.html" nodeid="312"&gt;@avoma&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Are you able to run hadoop fs -ls against both clusters from the HDP 2.4.x cluster? Try:&lt;/P&gt;&lt;P&gt;hadoop fs -ls hdfs://source_cluster_Active_NN:8020/tmp &lt;/P&gt;&lt;P&gt;hadoop fs -ls hdfs://destination_cluster_Active_NN:8020/tmp&lt;/P&gt;&lt;P&gt;Also, check whether the host is reachable from the destination cluster host. &lt;/P&gt;&lt;P&gt;If the above works, try running DistCp as a user other than hdfs, as below:&lt;/P&gt;&lt;P&gt;hadoop distcp -strategy dynamic -prgbup \&lt;/P&gt;&lt;P&gt;-&amp;lt;overwrite/update&amp;gt; \&lt;/P&gt;&lt;P&gt;hdfs://source_cluster_Active_NN:8020/&amp;lt;test_file_path&amp;gt; \&lt;/P&gt;&lt;P&gt;hdfs://destination_cluster_Active_NN:8020/&amp;lt;test_file_path&amp;gt; &lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 02:55:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141440#M32076</guid>
      <dc:creator>pardeep_kumar</dc:creator>
      <dc:date>2016-06-16T02:55:31Z</dc:date>
    </item>
    <item>
      <title>Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141441#M32077</link>
      <description>&lt;P&gt;Is a firewall blocking traffic between the clusters? Is there network connectivity between them? Does your user have full permissions to run DistCp?&lt;/P&gt;&lt;P&gt;Is this a large data set on a slow network?&lt;/P&gt;&lt;P&gt;The relevant error is: Caused by: java.net.SocketTimeoutException: connect timed out&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 03:35:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141441#M32077</guid>
      <dc:creator>TimothySpann</dc:creator>
      <dc:date>2016-06-16T03:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141442#M32078</link>
      <description>&lt;P&gt;As discussed with &lt;A rel="user" href="https://community.cloudera.com/users/312/avoma.html" nodeid="312"&gt;@avoma&lt;/A&gt;, the issue is resolved. It was caused by incorrect entries in the /etc/hosts file. &lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 04:58:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141442#M32078</guid>
      <dc:creator>pardeep_kumar</dc:creator>
      <dc:date>2016-06-16T04:58:54Z</dc:date>
    </item>
    <item>
      <title>Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141443#M32079</link>
      <description>&lt;P&gt;Thanks for your responses. The problem was with the network: public IPs were being used instead of private IPs. I updated the source cluster's private IPs on the destination cluster.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jun 2016 23:03:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141443#M32079</guid>
      <dc:creator>Arun-</dc:creator>
      <dc:date>2016-06-16T23:03:43Z</dc:date>
    </item>
    <item>
      <title>Re: DISTCP fails from CDH4.2(Non HA+Non-secure) to HDP 2.4(HA+Non-secure)</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141444#M32080</link>
      <description>&lt;P&gt;will be helpful&lt;/P&gt;</description>
      <pubDate>Thu, 14 Jul 2016 00:32:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DISTCP-fails-from-CHD4-2-Non-HA-Non-secure-to-HDP-2-4-HA-Non/m-p/141444#M32080</guid>
      <dc:creator>ripu</dc:creator>
      <dc:date>2016-07-14T00:32:37Z</dc:date>
    </item>
  </channel>
</rss>

