Unable to load into HBase due to error with data blocks

Expert Contributor

I get an error "Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 18, <datanode>): java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1426797840-1461158403571:blk_1089439824_15699635; getBlockSize()=0; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[,DISK], DatanodeInfoWithStorage[,DISK], DatanodeInfoWithStorage[,DISK]]}"

The background to this is: we changed our Oozie DB to MySQL, and on restarting the cluster the NameNode failed with a ConnectionRefused error. It was started manually from the CLI, then restarted with Ambari, and it has worked fine since. I have used hdfs fsck to check for corrupt files, but I get a 'healthy' status report.

Any clue as to how I can get past this issue? @Kuldeep Kulkarni @Artem Ervits @Sagar Shimpi @Benjamin Leonhardi
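In case it helps to narrow things down: a rough way to map the block ID from the error above back to a file path is to run fsck with -files -blocks and grep around that block ID (the -B 1 context is only a guess at the output layout, where the owning file is printed on the line before its blocks):

hdfs fsck / -files -blocks | grep -B 1 'blk_1089439824'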

1 ACCEPTED SOLUTION

Expert Contributor

This issue was resolved by restarting the namenode.

6 REPLIES

Guru

Expert Contributor

@srai Please find below the report I got from running hdfs fsck /

Status: HEALTHY
 Total size: 84086783290897 B (Total open files size: 35725218 B)
 Total dirs: 255918
 Total files: 7090531
 Total symlinks: 0 (Files currently being written: 464)
 Total blocks (validated): 7131287 (avg. block size 11791249 B) (Total open file blocks (not validated): 86)
 Corrupt blocks: 0
 Number of data-nodes: 8
 Number of racks: 1
FSCK ended at Fri Jul 08 16:43:47 SAST 2016 in 141368 milliseconds

The filesystem under path '/' is HEALTHY

Super Guru

Can you share the hdfs fsck command you ran? It definitely sounds like HDFS is not healthy.
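For context, a typical invocation for this kind of "Cannot obtain block length" symptom also checks for files still held open for write; the root path here is just an example and can be narrowed to whatever directory the failing job reads:

hdfs fsck / -openforwrite -files -blocks -locations | grep -i OPENFORWRITE

Files flagged OPENFORWRITE still have an unfinalized last block, which is commonly what triggers this error when a writer died without closing the file.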

Expert Contributor

Here is another fsck report, Josh.

Status: HEALTHY
 Total size: 84184775260004 B (Total open files size: 36288883 B)
 Total dirs: 255954
 Total files: 7102482
 Total symlinks: 0 (Files currently being written: 456)
 Total blocks (validated): 7143238 (avg. block size 11785240 B) (Total open file blocks (not validated): 79)
 Minimally replicated blocks: 7143238 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 130 (0.0018199029 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.9979758
 Corrupt blocks: 0
 Missing replicas: 257 (0.0012000647 %)
 Number of data-nodes: 8
 Number of racks: 1
FSCK ended at Fri Jul 08 18:29:19 SAST 2016 in 239594 milliseconds

The filesystem under path '/' is HEALTHY

Super Guru

Thanks, @Joshua Adeleke. As in the other question linked by Srai, if you know the specific file(s) your job is reading, you could try the `hdfs debug recoverLease` command on those files. Normally, a lease on an HDFS file expires automatically if the writer goes away abnormally without closing the file. If you are sure no client is still trying to write the file, you can use recoverLease to force the NameNode to let this operation succeed.
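A sketch of what that looks like; the path is a placeholder for whichever file the job actually fails on, and the retry count is arbitrary:

hdfs debug recoverLease -path /data/ingest/stuck-file -retries 5

If the lease is recovered, the NameNode finalizes the last block, and a subsequent read should then be able to determine the block length.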

Expert Contributor

This issue was resolved by restarting the namenode.