
java.io.IOException: Error reading file

Explorer

I'm unable to query a partitioned ORC table in Hive when filtering on null values. A simple select with limit 10 against the ORC-formatted table returns results, but the query fails when I filter out null values (col_name != ' '). The file has sufficient permissions.

Please let me know where I'm going wrong.

It throws the error below:

	Diagnostic Messages for this Task:
	Error: java.io.IOException: java.io.IOException: java.io.IOException: Error reading file: hdfs://filename
	    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:227)
	    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:137)
	    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
	    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
	    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
	    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	    at java.security.AccessController.doPrivileged(Native Method)
	    at javax.security.auth.Subject.doAs(Subject.java:422)
	    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
	Caused by: java.io.IOException: java.io.IOException: Error reading file: hdfs://filename
	    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
	    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:106)
	    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
	    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
	    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:225)
	    ... 11 more
	Caused by: java.io.IOException: Error reading file: hdfs://filename
	    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1051)
	    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:171)
	    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:145)
	    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
	    ... 15 more
	Caused by: java.io.IOException: Seek outside of data in compressed stream Stream for column 29 kind DATA position: 15982 length: 130886 range: 0 offset: 17609 limit: 17609 range 0 = 0 to 15982; range 1 = 106407 to 24479 uncompressed: 128 to 128 to 15982
	    at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.seek(InStream.java:365)
	    at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:182)
	    at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:238)
	    at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readLongBE(SerializationUtils.java:1139)
	    at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.unrolledUnPackBytes(SerializationUtils.java:1055)
	    at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.unrolledUnPack16(SerializationUtils.java:1014)
	    at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:884)
	    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:255)
	    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:62)
	    at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
	    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.next(TreeReaderFactory.java:1712)
	    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.next(TreeReaderFactory.java:1397)
	    at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
	    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1044)
	    ... 18 more
	Container killed by the ApplicationMaster.
	Container killed on request. Exit code is 143
	Container exited with a non-zero exit code 143
	FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
	MapReduce Jobs Launched:
	Stage-Stage-1: Map: 6487 Cumulative CPU: 7812.99 sec HDFS Read: 2685196842 HDFS Write: 0 FAIL
	Total MapReduce CPU Time Spent: 0 days 2 hours 10 minutes 12 seconds 990 msec
4 REPLIES

Master Mentor

@Raj ji

The error says:

Caused by: java.io.IOException: java.io.IOException: Error reading file: hdfs://filename


Usually an HDFS file path is in a format like:

hdfs://NAMENODE_HOSTNAME:PORT/filename

Or, in the case of NameNode HA, it uses the nameservice name, like the following:

hdfs://HA_SERVICE_NAME/filename


It looks like in your case the HA nameservice (cluster name) or the NameNode address is missing.
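
If you want to sanity-check that a fully qualified path actually resolves against your NameNode (or HA nameservice), here is a minimal sketch using the standard Hadoop FileSystem API. The class name and the path argument are placeholders, not anything from your job:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; only assumes the standard Hadoop FileSystem API.
public class CheckHdfsPath {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS and any HA nameservice settings from core-site.xml / hdfs-site.xml
        Configuration conf = new Configuration();

        // e.g. hdfs://NAMENODE_HOSTNAME:PORT/filename or hdfs://HA_SERVICE_NAME/filename
        Path path = new Path(args[0]);

        FileSystem fs = path.getFileSystem(conf);
        FileStatus status = fs.getFileStatus(path); // throws FileNotFoundException if it does not resolve
        System.out.println(status.getPath() + " length=" + status.getLen());
    }
}

If the path prints cleanly here but the Hive query still fails, the problem is in the file contents rather than the path.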

Explorer

I've replaced the actual HDFS location with hdfs://filename in my post.

Master Mentor

@Raj ji

Are you using the ORC writer APIs directly, or a custom MapReduce program, to write the ORC files? If so, you need to make sure that no row you write to ORC is null.

The ORC writer does not accept null as a row; the columns within a row can be null, though. This can be verified with orcfiledump (for example, hive --orcfiledump <path-to-orc-file>).

Example:

writer.addRow(null);                                  // invalid: the row itself must not be null
writer.addRow(Arrays.asList(a, b, null, d));          // valid: individual columns may be null
writer.addRow(Arrays.asList(null, null, null, null)); // valid: all columns null is still a non-null row


The statement above is taken as a reference from https://issues.apache.org/jira/browse/HIVE-5922 (which is not considered a bug).
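
For illustration, here is a minimal self-contained sketch of that rule against the Hive 1.x ORC writer API (org.apache.hadoop.hive.ql.io.orc). The class name, output path, and schema are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

// Hypothetical demo class; assumes the Hive 1.x ORC writer API.
public class OrcNullRowDemo {
    // Simple POJO whose fields define the ORC struct schema via reflection
    static class MyRow {
        String name;
        Integer value;
        MyRow(String name, Integer value) { this.name = name; this.value = value; }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        ObjectInspector inspector = ObjectInspectorFactory.getReflectionObjectInspector(
                MyRow.class, ObjectInspectorFactory.ObjectInspectorOptions.JAVA);
        Writer writer = OrcFile.createWriter(new Path("/tmp/demo.orc"), // hypothetical output path
                OrcFile.writerOptions(conf).inspector(inspector));

        writer.addRow(new MyRow("a", 1));     // valid row
        writer.addRow(new MyRow(null, null)); // valid: all columns null, but the row object itself is not
        // writer.addRow(null);               // invalid per HIVE-5922: null rows corrupt the file
        writer.close();
    }
}

A file written with null rows can fail on read with exactly the kind of seek error shown in your stack trace.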

Explorer

I'm running a Hive query, and it launches a MapReduce job. The table is a partitioned, ORC-formatted table. I'm not trying to insert values into the table; I need to filter out null values from it. When I try to do that, I get the above error. I still couldn't figure out why.