
distcp failing intermittently to copy the file from one HDFS and another HDFS

Contributor

Hi All,

 

I am running a distcp command that copies the audit logs HDFS folder to another HDFS folder for further processing.

 

The distcp command worked fine until about two weeks ago and started failing last week. I checked the detailed MR logs and found that only particular files fail to copy; the other audit log folders/files (Kafka, Hive, NiFi, HBase) are copied successfully. Only some specific files fail.

 

distcp command:

hadoop distcp -filters $filter_file_loc ranger/audit /data/audit_logs/staging
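
(For reference, the -filters option points to a text file containing one regular expression per line; any source path matching a pattern is excluded from the copy. A minimal sketch of what $filter_file_loc might contain, assuming the goal is to skip temporary/in-flight files, is shown below; the patterns are illustrative, not taken from the actual filter file.)

.*\.tmp$
.*_COPYING_$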

 

Distribution: Cloudera Data Platform version 7.1.7

Please find the detailed error messages below.

 

java.io.IOException: File copy failed: hdfs://namenode/ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log --> hdfs://namenode/data/audit_logs/staging/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log

Caused by: org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock{BP-1024772623-10.107.146.29-1593441936031:blk_1183449574_109711397; getBlockSize()=64553182; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.107.145.208:9866,DS-b11e932b-0460-47b7-a281-3743ecf9c581,DISK]]} of /ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log
	at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:370)
	at org.apache.hadoop.hdfs.DFSInputStream.getLastBlockLength(DFSInputStream.java:279)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:260)
	at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:203)
	at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:187)
	at org.apache.hadoop.hdfs.DFSClient.openInternal(DFSClient.java:1056)
	at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1019)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:338)
	at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:334)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:351)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:954)
	at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.getInputStream(RetriableFileCopyCommand.java:331)
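
(The block state of the failing file can be inspected with a standard HDFS fsck; a sketch, with the path copied from the error above:)

hdfs fsck /ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log -files -blocks -locations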

@distcp

2 Replies

Expert Contributor

Hi, 

 

I can see the error:

"Caused by: org.apache.hadoop.hdfs.CannotObtainBlockLengthException: Cannot obtain block length for LocatedBlock"

This typically happens because the file is still being written or has not yet been closed. Please check whether the file is in use or still being written to while the distcp runs.
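
A quick way to list files that are still open for write under the source directory is a standard fsck scan (a sketch; adjust the path to your audit location):

hdfs fsck /ranger/audit -openforwrite | grep -i OPENFORWRITE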

Contributor

Hi @arunek95 

 

Yes, the workaround has been applied by following the community posts. As of now, we don't have a root cause for why so many files were left in OPENFORWRITE state for those particular two days in our cluster.

 

https://community.cloudera.com/t5/Support-Questions/Cannot-obtain-block-length-for-LocatedBlock/td-p...
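
For reference, the workaround discussed in that thread is to recover the lease on each affected file so the NameNode closes it (a sketch; the path below is the example file from the error above, and -retries is optional):

hdfs debug recoverLease -path /ranger/audit/kafka/kafka/20210927/kafka_ranger_audit_svl.host.int.log -retries 5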

 

Thanks