Support Questions


I am getting an OutOfMemoryError while inserting data into a table; I tried increasing the Java heap but it doesn't help

New Contributor

FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:223)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
    at org.apache.hadoop.hive.ql.io.orc.RunLengthByteWriter.flush(RunLengthByteWriter.java:58)
    at org.apache.hadoop.hive.ql.io.orc.BitFieldWriter.flush(BitFieldWriter.java:44)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:553)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:165)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

1 ACCEPTED SOLUTION

Super Guru

This is due to the memory required by the ORC writer while writing ORC files. You can limit that memory by lowering orc.compress.size, which is 256 KB by default. I am not sure about your heap size, so start testing with an 8 KB buffer using

alter table table_name set tblproperties ("orc.compress.size"="8192");

and see if it helps.
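If you ever recreate the table, the same property can also be applied at creation time. A minimal sketch, assuming a made-up table name and columns:

-- hypothetical table; the only relevant part is the tblproperties clause
create table orders_orc (
  order_id bigint,
  customer string
)
stored as orc
-- 8 KB ORC compression buffer instead of the 256 KB default
tblproperties ("orc.compress.size"="8192");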


2 REPLIES


Master Guru

If you say that increasing the heap doesn't help, are we talking about decent sizes like 8 GB+? Also, did you increase the Java opts AND the container size?

set hive.tez.java.opts="-Xmx3400m";

set hive.tez.container.size = 4096;
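The stack trace above comes from org.apache.hadoop.mapred.YarnChild, so the job may be running on classic MapReduce rather than Tez; if so, a rough sketch of the equivalent MapReduce knobs (the values are only illustrative, adjust to your cluster):

-- container size for map tasks, in MB
set mapreduce.map.memory.mb=4096;
-- JVM heap inside that container; keep it below the container size
set mapreduce.map.java.opts=-Xmx3400m;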

If yes, then you most likely have a different problem, for example loading data into a partitioned table. ORC writers keep one buffer open for every output file, so a badly organized load into a partitioned table keeps many buffers (and a lot of memory) open at once. There are ways around it, such as an optimized sorted load or the distribute by keyword (see the sketch after the link below).

http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
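A rough sketch of the distribute by approach for a dynamically partitioned insert; the table, column, and partition names here are hypothetical, and hive.optimize.sort.dynamic.partition is the separate "optimized sorted load" option:

-- allow a fully dynamic partition column
set hive.exec.dynamic.partition.mode=nonstrict;
-- route all rows of a given partition to the same reducer, so each
-- reducer keeps only one ORC writer open at a time (names are made up)
insert overwrite table sales_partitioned partition (sale_date)
select product_id, amount, sale_date
from sales_staging
distribute by sale_date;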

If, however, the task is using significantly less than 4-8 GB, then you should increase that first.