
I am getting an OutOfMemoryError while inserting data into a table; I tried increasing the Java heap but it doesn't help


FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:223)
    at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
    at org.apache.hadoop.hive.ql.io.orc.RunLengthByteWriter.flush(RunLengthByteWriter.java:58)
    at org.apache.hadoop.hive.ql.io.orc.BitFieldWriter.flush(BitFieldWriter.java:44)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:553)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:165)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:843)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:577)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:227)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)

1 ACCEPTED SOLUTION

This is due to the memory required by the ORC writer while writing ORC files. You can limit the memory use by tweaking the value of orc.compress.size, which is 256 KB by default. I am not sure about your heap size, so start testing with an 8 KB buffer using

alter table table_name set tblproperties("orc.compress.size"="8192")

and see if it helps.
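For reference, a minimal sketch of setting the same property at table-creation time, so every ORC file written for the table uses the smaller compression buffer. The table and column names below are placeholders, not from the original question:

-- Hypothetical table; the orc.compress.size property is the only relevant part.
CREATE TABLE my_orc_table (
  id   BIGINT,
  name STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress.size"="8192");

For an existing table, the ALTER TABLE statement above should only affect ORC files written after the change; files already on disk keep their original buffer size.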


2 REPLIES


If you say that increasing the heap doesn't help, are we talking about decent sizes like 8 GB+? Also, did you increase the Java opts AND the container size?

set hive.tez.java.opts="-Xmx3400m";

set hive.tez.container.size = 4096;

If yes, then you most likely have a different problem, for example loading data into a partitioned table. ORC writers keep one buffer open for every output file, so if you load a partitioned table badly they will keep a lot of memory allocated. There are ways around this, such as an optimized sorted load or the DISTRIBUTE BY keyword (see the sketch after the link below).

http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data
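A rough sketch of the DISTRIBUTE BY approach, with hypothetical table and column names (stage_sales as an unpartitioned staging table, sales partitioned by sale_date). Distributing on the partition column sends all rows of a given partition to the same reducer, so each reducer only keeps a handful of ORC writers, and their buffers, open at once:

-- Hypothetical names; adjust to your schema.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE sales PARTITION (sale_date)
SELECT id, amount, sale_date
FROM stage_sales
DISTRIBUTE BY sale_date;  -- route each partition's rows to a single reducer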

If, however, the task uses significantly less than 4-8 GB, then you should increase that first.
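Also note that the stack trace above comes from a MapReduce task (YarnChild/ExecMapper) rather than Tez, so if the insert runs on MR the equivalent knobs are the map container size and its child JVM heap. A sketch with illustrative values only; a common rule of thumb is to keep -Xmx at roughly 80% of the container size:

-- Illustrative sizes, not a recommendation for your cluster.
SET mapreduce.map.memory.mb=4096;
SET mapreduce.map.java.opts=-Xmx3400m;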
