
Hive query failed with java.io.IOException: Cannot obtain block length for LocatedBlock

Explorer

Dear Community,

I tried to run a simple query such as: SELECT * FROM tweets_raw or SELECT Count(*) FROM tweets_raw

It looks like Hive needs special consideration once the external table grows.

When I had fewer than 30,000 rows everything was OK, but once the table grew, different kinds of problems began to appear.

In this case, the query was interrupted with the Java exception shown below.

I would appreciate any suggestions on how to solve this issue.

Best regards,

JAG

----

Vertex failed, vertexName=Map 1, vertexId=vertex_1460493428694_0042_2_00, diagnostics=Task failed, taskId=task_1460493428694_0042_2_00_000010, diagnostics=TaskAttempt 0 failed, info=
Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1826592285-10.230.3.7-1448557559182:blk_1073957941_217135; getBlockSize()=2112; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.230.3.7:50010,DS-f6d96d0f-41ce-4f1b-be1d-e75c0cae6471,DISK]]}
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1826592285-10.230.3.7-1448557559182:blk_1073957941_217135; getBlockSize()=2112; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.230.3.7:50010,DS-f6d96d0f-41ce-4f1b-be1d-e75c0cae6471,DISK]]}
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:196)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:142)
at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:61)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:310)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148)
... 14 more
Caused by: java.io.IOException: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1826592285-10.230.3.7-1448557559182:blk_1073957941_217135; getBlockSize()=2112; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.230.3.7:50010,DS-f6d96d0f-41ce-4f1b-be1d-e75c0cae6471,DISK]]}
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:251)
at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:193)
... 19 more
Caused by: java.io.IOException: Cannot obtain block length for LocatedBlock{BP-1826592285-10.230.3.7-1448557559182:blk_1073957941_217135; getBlockSize()=2112; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.230.3.7:50010,DS-f6d96d0f-41ce-4f1b-be1d-e75c0cae6471,DISK]]}
at org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:390)
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:333)
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:269)
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:261)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1540)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:303)
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:299)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:767)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
... 20 more
 TaskAttempt 1, TaskAttempt 2 and TaskAttempt 3 failed with the same exception and stack trace as TaskAttempt 0.
 Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:1, Vertex vertex_1460493428694_0042_2_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1460493428694_0042_2_01, diagnostics=Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1460493428694_0042_2_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE

DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
4 REPLIES

Guru

This most likely has to do with the blocks that make up this data. Can you check whether there are any exceptions in the DataNode logs?
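
For example, something along these lines should map the block ID from the exception back to the file that owns it, and show whether the DataNode logged anything about it. The DataNode log path below is only the usual HDP default and is an assumption; adjust it for your install.

# find which file the failing block belongs to (increase -B if the file has many blocks)
hdfs fsck / -files -blocks -locations | grep -B 10 "blk_1073957941"

# on DataNode 10.230.3.7, look for exceptions mentioning that block (log path is an assumption)
grep "blk_1073957941" /var/log/hadoop/hdfs/hadoop-hdfs-datanode-*.log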

Explorer

Hi @Emil / @Ravi Mutyala

Thank you very much for looking into this issue.

Indeed, I ran hdfs fsck / -files -blocks -locations and tried to find the file containing the block BP-1826592285-10.230.3.7-1448557559182:blk_1073957941_217135 that is raising the exception. fsck reported that the filesystem is healthy.

However, the block is not listed in the output of that command. It seems there is a file that was not closed by Flume and is therefore not seen by fsck. I don't know how to find the problematic file in order to delete or close it.
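
From what I have read, fsck by default skips files that are still open for write, which could explain why the block does not show up. As a rough sketch of what I am considering (the path below is only a placeholder for the directory my Flume sink writes into, and hdfs debug recoverLease assumes a Hadoop 2.7+ client):

# list files under the table location that are still open for write (placeholder path)
hdfs fsck /path/to/tweets_raw -files -blocks -locations -openforwrite

# if a file is reported as OPENFORWRITE, try to close it by recovering its lease
hdfs debug recoverLease -path /path/to/tweets_raw/<file-reported-open> -retries 3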

The logs give the same information posted above.

Best regards,

JAG

I have the same issue as @JOSE GUILLEN described.

Did anyone find a solution?

Rising Star
The LocatedBlock entries in the exception stack carry the DataNode information for the block. In the case above, there is a single DataNode: DatanodeInfoWithStorage[10.230.3.7:50010,DS-f6d96d0f-41ce-4f1b-be1d-e75c0cae6471,DISK].

You can log in to that DataNode and change into the directory configured as dfs.datanode.data.dir, then run a find to check whether the block's data file and metadata file exist, as shown below.

find . -name "blk_1073957941*"
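
As a rough sketch, assuming a stock HDP layout (use whatever dfs.datanode.data.dir actually resolves to on your cluster; /hadoop/hdfs/data below is only the default and an assumption):

# resolve the configured DataNode data directories
hdfs getconf -confKey dfs.datanode.data.dir

# on DataNode 10.230.3.7, search those directories for the block file and its .meta file
find /hadoop/hdfs/data -name "blk_1073957941*" -exec ls -l {} \;

If the block file exists but the file it belongs to still shows as open for write, recovering the lease on that file (or deleting and re-ingesting it) is usually what clears this error.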