Created 07-20-2016 09:00 AM
I was running an insert query in Hive when I encountered the error below:
ERROR : Status: Failed ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1468411845662_1749_1_00, diagnostics=[Task failed, taskId=task_1468411845662_1749_1_00_000024, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1426797840-<ip_address>-1461158403571:blk_1090740708_17008023;
Any clue on how to get past this?
Note that when I run the hdfs fsck command, it returns a healthy status.
Please find the full Hive error log and the fsck status report attached.
Created 08-01-2016 07:07 AM
This issue was resolved by restarting the namenode.
Created 07-20-2016 09:04 AM
It seems an HDFS block is corrupted. Can you check with hdfs fsck? It will tell you which files have problems.
Created 07-20-2016 09:10 AM
@Rajkumar Singh Find below the file system check...
...................................................................Status: HEALTHY
 Total size: 102660394304099 B (Total open files size: 1000539276 B)
 Total dirs: 278839
 Total files: 8403467
 Total symlinks: 0 (Files currently being written: 1044)
 Total blocks (validated): 8364313 (avg. block size 12273619 B) (Total open file blocks (not validated): 658)
 Minimally replicated blocks: 8364313 (100.0 %)
 Over-replicated blocks: 0 (0.0 %)
 Under-replicated blocks: 40135 (0.47983617 %)
 Mis-replicated blocks: 0 (0.0 %)
 Default replication factor: 3
 Average block replication: 2.9917786
 Corrupt blocks: 0
 Missing replicas: 54630 (0.21783337 %)
 Number of data-nodes: 8
 Number of racks: 1
FSCK ended at Wed Jul 20 10:51:25 SAST 2016 in 152457 milliseconds

The filesystem under path '/' is HEALTHY
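A note on reading this report: the HEALTHY status covers only the validated blocks, while the summary itself says that blocks of open files were "not validated". A "Cannot obtain block length" error is often associated with such files that are still open for write, although nothing in this thread confirms that this was the cause here. The following is a minimal Python sketch that pulls those counters out of an fsck summary. The function name `open_file_counts` and the trimmed `FSCK_SUMMARY` excerpt are illustrative assumptions, not part of any Hadoop tooling.

```python
import re

# Excerpt of the fsck summary quoted in this thread (trimmed to the relevant lines).
FSCK_SUMMARY = """Status: HEALTHY
 Total size: 102660394304099 B (Total open files size: 1000539276 B)
 Total symlinks: 0 (Files currently being written: 1044)
 Total blocks (validated): 8364313 (avg. block size 12273619 B) (Total open file blocks (not validated): 658)
 Corrupt blocks: 0
"""


def open_file_counts(summary):
    """Extract the status and the open-file counters from an fsck summary.

    Hypothetical helper: it relies only on the labels visible in the
    report quoted above and returns None for any label it cannot find.
    """
    patterns = {
        "status": r"Status:\s*(\w+)",
        "files_being_written": r"Files currently being written:\s*(\d+)",
        "open_blocks_not_validated": r"Total open file blocks \(not validated\):\s*(\d+)",
    }
    result = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, summary)
        if match is None:
            result[key] = None
            continue
        value = match.group(1)
        # Numeric fields become ints; the status stays a string.
        result[key] = int(value) if value.isdigit() else value
    return result


counts = open_file_counts(FSCK_SUMMARY)
print(counts)
```

A nonzero `open_blocks_not_validated` alongside a HEALTHY status suggests rerunning fsck with the `-openforwrite` option so that open files are listed as well.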
Created 07-20-2016 09:22 AM
Is this intermittent, or does it happen constantly?
Created 07-20-2016 10:17 AM
Constantly!
Created 07-20-2016 03:37 PM
@Josh Elser I was reading some JIRAs and found that you may have run into similar issues. From some general searching, no one seems to have a great answer for this. Some people restart HDFS. Others manually copy the block up into HDFS as the WAL to make progress. I also found that this may be caused by a full disk. What are your thoughts?
Created 07-21-2016 08:27 AM
I agree with you about the full-disk problem: this was a case where the log directory was full, but the namenode seems not to have recovered from that even after several restarts. I would have thought there would be a "-repair" option for the fsck command, just like for the hbck command. My question: how can we get the namenode to update its metadata so that we can resolve this block location issue once and for all?
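One approach that is sometimes used for files stuck open for write, and that is not mentioned or confirmed anywhere in this thread, is to list them with `hdfs fsck <path> -openforwrite` and then ask the namenode to recover each lease with `hdfs debug recoverLease`. The Python sketch below only builds the command strings so they can be reviewed before anything is run. The sample fsck lines, the example paths, the assumed line format (`<path> <n> bytes, ... OPENFORWRITE`), and the helper names are all hypothetical; the exact fsck output varies between Hadoop versions.

```python
import re
import shlex

# Assumed shape of an fsck line for a file open for write.
# This format is an assumption and may differ between Hadoop versions.
OPENFORWRITE_LINE = re.compile(r"^(?P<path>/.*?) \d+ bytes,.*OPENFORWRITE")

# Hypothetical sample output; these paths do not come from the thread.
SAMPLE_FSCK_OUTPUT = """/apps/hive/warehouse/example_table/part-00000 0 bytes, 1 block(s), OPENFORWRITE:
/apps/hive/warehouse/example_table/part-00001 134217728 bytes, 1 block(s):  OK
"""


def open_paths(fsck_output):
    """Return the paths of files that fsck reports as open for write."""
    paths = []
    for line in fsck_output.splitlines():
        match = OPENFORWRITE_LINE.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


def recover_lease_commands(paths, retries=3):
    """Build one 'hdfs debug recoverLease' command string per path.

    The commands are returned, not executed, so they can be inspected first.
    """
    return [
        "hdfs debug recoverLease -path {} -retries {}".format(shlex.quote(p), retries)
        for p in paths
    ]


commands = recover_lease_commands(open_paths(SAMPLE_FSCK_OUTPUT))
for command in commands:
    print(command)
```

Building the commands as text instead of running them directly is a deliberate choice here: lease recovery changes namenode state, so each path is worth checking by hand first.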