Created 07-08-2016 03:44 PM
I get an error "Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 18, <datanode>): java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1426797840-1461158403571:blk_1089439824_15699635; getBlockSize()=0; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[,DISK], DatanodeInfoWithStorage[,DISK], DatanodeInfoWithStorage[,DISK]]}"
The background to this is: we changed our Oozie DB to MySQL, and on restarting the cluster the NameNode failed with a ConnectionRefused error. It was started manually from the CLI, afterwards restarted with Ambari, and then it worked fine. I have used hdfs fsck to check for corrupt files, but I get a 'HEALTHY' status report.
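One thing worth checking: this error usually means a file is still marked as open for write (under construction), and a plain `hdfs fsck /` can still report HEALTHY in that case. A sketch of how to surface those files, filtering the report down to just the open paths (the report snippet and paths below are illustrative, not from this cluster):

```shell
# On a real cluster, list the files stuck open for write with:
#   hdfs fsck / -files -openforwrite | grep OPENFORWRITE
# Illustrative fsck report snippet (hypothetical paths):
fsck_report='/tmp/logs/app1.log 35725218 bytes, 1 block(s), OPENFORWRITE:
/data/clean/part-0000 2048 bytes, 1 block(s):  OK'
# Keep only the paths still open for write.
printf '%s\n' "$fsck_report" | awk '/OPENFORWRITE/ {print $1}'
```

If the file your Spark job reads shows up in that list, the block length of its last block is unresolved, which matches the `getBlockSize()=0` in the error.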
Any clue as to how i can get past this issue? @Kuldeep Kulkarni @Artem Ervits @Sagar Shimpi @Benjamin Leonhardi
Created 08-01-2016 07:09 AM
This issue was resolved by restarting the namenode.
Created 07-08-2016 03:47 PM
Created 07-08-2016 04:00 PM
@srai Please find below the report I got from running hdfs fsck /
...............................Status: HEALTHY
 Total size: 84086783290897 B (Total open files size: 35725218 B)
 Total dirs: 255918
 Total files: 7090531
 Total symlinks: 0 (Files currently being written: 464)
 Total blocks (validated): 7131287 (avg. block size 11791249 B) (Total open file blocks (not validated): 86)
 Corrupt blocks: 0
 Number of data-nodes: 8
 Number of racks: 1
FSCK ended at Fri Jul 08 16:43:47 SAST 2016 in 141368 milliseconds
The filesystem under path '/' is HEALTHY
Created 07-08-2016 03:50 PM
Can you share the hdfs fsck command you ran? It definitely sounds like HDFS is not healthy.
Created 07-08-2016 04:37 PM
Here is another one below, Josh.
Status: HEALTHY
 Total size: 84184775260004 B (Total open files size: 36288883 B)
 Total dirs: 255954
 Total files: 7102482
 Total symlinks: 0 (Files currently being written: 456)
 Total blocks (validated): 7143238 (avg. block size 11785240 B) (Total open file blocks (not validated): 79)
 Minimally replicated blocks: 7143238 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 130 (0.0018199029 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.9979758
 Corrupt blocks: 0
 Missing replicas: 257 (0.0012000647 %)
 Number of data-nodes: 8
 Number of racks: 1
FSCK ended at Fri Jul 08 18:29:19 SAST 2016 in 239594 milliseconds
The filesystem under path '/' is HEALTHY
Created 07-08-2016 05:24 PM
Thanks, @Joshua Adeleke. As in the other question Srai linked, if you know the specific file(s) your job is reading, you could try the `hdfs debug recoverLease` command on those files. Normally, a lease on an HDFS file expires automatically when a writer goes away abnormally without closing the file. If you are sure no client is still writing the file, you could use recoverLease to force the NameNode to close it and let your read succeed.
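A minimal sketch of that recovery step, assuming the stuck files have already been identified (e.g. via `hdfs fsck / -openforwrite`). The path below is hypothetical, and the commands are printed rather than executed so they can be reviewed before running:

```shell
# Hypothetical list of files still open for write; replace with the
# real paths reported by 'hdfs fsck / -openforwrite'.
open_paths='/tmp/logs/app1.log'

# Print one recoverLease invocation per path; pipe the output to 'sh'
# (or run the lines by hand) once the list looks right.
for p in $open_paths; do
  echo "hdfs debug recoverLease -path $p -retries 5"
done
```

Note that `hdfs debug recoverLease` is only available in Hadoop 2.7 and later; on older clusters, restarting the writer or waiting out the hard lease limit are the alternatives.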