<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: distcp failing intermittently to copy the file from one HDFS and another HDFS in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/327573#M230094</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/77994"&gt;@arunek95&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, the workaround has been applied by following the community post linked below. As of now, we have not found a root cause for why many files were in the OPENFORWRITE state on two particular days in our cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517" target="_blank"&gt;https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Wed, 13 Oct 2021 12:47:41 GMT</pubDate>
    <dc:creator>adhishankarit</dc:creator>
    <dc:date>2021-10-13T12:47:41Z</dc:date>
    <item>
      <title>distcp failing intermittently to copy the file from one HDFS and another HDFS</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/326149#M229788</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am running a distcp command that copies all the audit logs from one HDFS folder to another HDFS folder for further processing.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The distcp command worked fine until two weeks ago and started failing last week. I checked the detailed MR logs and found that only the copy of particular files fails; the other audit log folders and files, such as kafka, hive, nifi and hbase, are copied.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;distcp command:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;hadoop distcp -filters $filter_file_loc&amp;nbsp;ranger/audit /data/audit_logs/staging&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Distribution: Cloudera Data Platform version 7.1.7&lt;/P&gt;&lt;P&gt;Please find the detailed error messages below.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;java.io.IOException: File copy failed: hdfs://namenode/ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log --&amp;gt; hdfs://namenode/data/audit_logs/staging/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log&lt;BR /&gt;&lt;BR /&gt;&lt;/PRE&gt;&lt;PRE&gt;Caused by: org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock{BP-1024772623-10.107.146.29-1593441936031:blk_1183449574_109711397; getBlockSize()=64553182; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.107.145.208:9866,DS-b11e932b-0460-47b7-a281-3743ecf9c581,DISK]]} of /ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log
	at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:370)
	at org.apache.hadoop.hdfs.DFSInputStream.getLastBlockLength(DFSInputStream.java:279)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:260)
	at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:203)
	at org.apache.hadoop.hdfs.DFSInputStream.&amp;lt;init&amp;gt;(DFSInputStream.java:187)
	at org.apache.hadoop.hdfs.DFSClient.openInternal(DFSClient.java:1056)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1019)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:338)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:334)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:351)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:954)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.getInputStream(RetriableFileCopyCommand.java:331)&lt;/PRE&gt;&lt;P&gt;@distcp&lt;/P&gt;</description>
      <pubDate>Sun, 03 Oct 2021 12:31:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/326149#M229788</guid>
      <dc:creator>adhishankarit</dc:creator>
      <dc:date>2021-10-03T12:31:13Z</dc:date>
    </item>
    <item>
      <title>Re: distcp failing intermittently to copy the file from one HDFS and another HDFS</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/327230#M230025</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can see that the error is:&lt;/P&gt;&lt;P class="p1"&gt;"Caused by: org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock"&amp;nbsp;&lt;/P&gt;&lt;P class="p1"&gt;This usually happens&lt;SPAN&gt;&amp;nbsp;because the file is still being written or has not yet been closed. Please check whether the file is in use or being written to while distcp runs.&lt;/SPAN&gt;&lt;/P&gt;</description>
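One way to do the check suggested above is to list the files HDFS still considers open with `hdfs fsck /ranger/audit -files -openforwrite`, then ask the NameNode to close the stale ones with `hdfs debug recoverLease`. Below is a minimal Python sketch of that idea, not a definitive procedure. The helper names `open_for_write_paths` and `recover_lease_commands` are hypothetical, and the shape of the fsck output line is an assumption that should be compared with the real output on your cluster before relying on it.

```python
import re
import shlex

# Assumed shape of one line of `hdfs fsck PATH -files -openforwrite` output:
#   /some/file 1234 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE: ...
# Verify this against your own cluster; the exact layout may differ by version.
_OPEN_LINE = re.compile(r"^(/.*?) \d+ bytes, .*OPENFORWRITE")


def open_for_write_paths(fsck_output: str) -> list[str]:
    """Return the paths that the fsck output marks as still open for write."""
    paths = []
    for line in fsck_output.splitlines():
        match = _OPEN_LINE.match(line.strip())
        if match:
            paths.append(match.group(1))
    return paths


def recover_lease_commands(paths: list[str], retries: int = 3) -> list[str]:
    """Build one `hdfs debug recoverLease` command line per open file."""
    return [
        f"hdfs debug recoverLease -path {shlex.quote(p)} -retries {retries}"
        for p in paths
    ]


if __name__ == "__main__":
    # Hypothetical fsck output, used only to illustrate the parsing.
    sample = (
        "/ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log "
        "64553182 bytes, replicated: replication=3, 1 block(s), OPENFORWRITE: \n"
        "/ranger/audit/hive/example.log 10 bytes, replicated: replication=3, "
        "1 block(s):  OK\n"
    )
    for command in recover_lease_commands(open_for_write_paths(sample)):
        print(command)
```

The sketch only prints the commands rather than running them, so they can be reviewed first. Recovering the lease of a file that a service is still actively writing to is not something to do blindly.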
      <pubDate>Mon, 11 Oct 2021 11:32:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/327230#M230025</guid>
      <dc:creator>arunek95</dc:creator>
      <dc:date>2021-10-11T11:32:36Z</dc:date>
    </item>
    <item>
      <title>Re: distcp failing intermittently to copy the file from one HDFS and another HDFS</title>
      <link>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/327573#M230094</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/77994"&gt;@arunek95&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, the workaround has been applied by following the community post linked below. As of now, we have not found a root cause for why many files were in the OPENFORWRITE state on two particular days in our cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517" target="_blank"&gt;https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p/117517&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 13 Oct 2021 12:47:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/distcp-failing-intermittently-to-copy-the-file-from-one-HDFS/m-p/327573#M230094</guid>
      <dc:creator>adhishankarit</dc:creator>
      <dc:date>2021-10-13T12:47:41Z</dc:date>
    </item>
  </channel>
</rss>

