Created 02-22-2016 01:14 PM
Running this statement
INSERT INTO TABLE FOO PARTITION(partition_date)
SELECT DISTINCT [columns from BAR]
FROM BAR LEFT OUTER JOIN FOO ON (BAR.application.id = FOO.unique_id)
WHERE FOO.unique_id IS NULL
fails with the stack trace below. The only setting I could find that seemed relevant was hive.exec.orc.default.buffer.size, but I confirmed that it is already set to the default value of 262,144. FOO has about 3.8B rows and is an ORC table; BAR is an external Avro table. I'm running on HDP 2.3.4 with Hive 1.2.1.
Does anyone have suggestions for addressing this?
Caused by: java.lang.IllegalArgumentException: Buffer size too small. size = 32768 needed = 146215
  at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:193)
  at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:238)
  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDirectTreeReader.next(TreeReaderFactory.java:1554)
  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.next(TreeReaderFactory.java:1397)
  at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
  at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1039)
  at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.next(OrcRawRecordMerger.java:249)
  at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$ReaderPair.<init>(OrcRawRecordMerger.java:186)
  at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.<init>(OrcRawRecordMerger.java:226)
  at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.<init>(OrcRawRecordMerger.java:437)
  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:1269)
  at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1151)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:249)
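For readers trying to interpret the "size" and "needed" numbers in that message, here is a rough Python sketch of the kind of check involved. It assumes the chunk header layout described in the ORC format specification (a 3-byte little-endian value holding the chunk length shifted left by one bit, with the low bit flagging an uncompressed "original" chunk). The function names are hypothetical and this is not Hive's actual InStream code, only an illustration of why a chunk longer than the reader's buffer would be rejected.

```python
from typing import Tuple


def decode_chunk_header(header: bytes) -> Tuple[int, bool]:
    """Decode a 3-byte ORC compression chunk header (layout assumed from the ORC spec).

    Returns (chunk_length, is_original).
    """
    if len(header) != 3:
        raise ValueError("an ORC chunk header is exactly 3 bytes")
    raw = int.from_bytes(header, "little")
    # Low bit: 1 means the chunk is stored uncompressed ("original").
    # Remaining 23 bits: length of the chunk in bytes.
    return raw >> 1, bool(raw & 1)


def check_chunk_fits(header: bytes, buffer_size: int) -> int:
    """Hypothetical stand-in for the reader-side check that produced the error."""
    chunk_length, _ = decode_chunk_header(header)
    if chunk_length > buffer_size:
        raise ValueError(
            f"Buffer size too small. size = {buffer_size} needed = {chunk_length}"
        )
    return chunk_length


# Build a header for a compressed chunk of 146215 bytes (the "needed" value above).
header = (146215 << 1).to_bytes(3, "little")

try:
    check_chunk_fits(header, 32768)      # the reader believes the buffer is 32K
except ValueError as err:
    message = str(err)

fits = check_chunk_fits(header, 262144)  # a 256K buffer would accept the chunk
```

Under these assumptions, the error would mean the file contains a chunk written with a larger compression buffer than the one the reader is using, which would be consistent with the 32K value not changing when io.file.buffer.size is raised.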
Created 02-22-2016 01:23 PM
Try increasing the property below in core-site.xml:
<property>
<name>io.file.buffer.size</name>
<value>146215</value>
</property>
Created 02-22-2016 01:45 PM
Thanks @Divakar Annapureddy! I checked, and that value is currently 131072 in core-site.xml. I tried overriding it in Hive with "set io.file.buffer.size=146215" and got the same error message. In other words, the reader still reports a buffer size of 32K, not the value in core-site.xml or the one I set through Hive.
Created 02-22-2016 02:09 PM
Can you please try this as well link
Created 02-22-2016 03:21 PM
Interesting, it looks like I'm seeing a similar error in a different context (my version of Hive doesn't have any of the LLAP functionality, as I understand it).
Created 03-02-2016 11:22 PM
I thought this was HIVE-12450 ("OrcFileMergeOperator does not use correct compression buffer size"), but perhaps there is still a problem here.