
Hive Job - OutOfMemoryError: Java heap space

New Contributor

On our cluster, we are running an INSERT OVERWRITE query through the Hive CLI, and it fails with OutOfMemoryError: Java heap space.
I have tried multiple SET parameters and increased the machine configuration, but with no luck.

set hive.exec.max.dynamic.partitions=4000;
set hive.exec.max.dynamic.partitions.pernode=500;
set hive.exec.dynamic.partition.mode=nonstrict;
SET mapreduce.map.java.opts=-Xmx3686m;
SET mapreduce.reduce.java.opts=-Xmx3686m;
SET mapred.child.java.opts=-Xmx10g;
set hive.tez.container.size=16384;
set tez.task.resource.memory.mb=16384;
set tez.am.resource.memory.mb=8192;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
set hive.support.concurrency=false;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.exec.orc.split.strategy=BI;
set hive.exec.reducers.max=150;

Error thrown

Status: Failed
Vertex failed, vertexName=Reducer 4, vertexId=vertex_1731327513546_0052_5_08, diagnostics=[Task failed, taskId=task_1731327513546_0052_5_08_000045, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.base/java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:198)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:177)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.lambda$create$5(GoogleHadoopFileSystem.java:547)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem$$Lambda$273/0x000000080077e040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$274/0x000000080077d040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.create(GoogleHadoopFileSystem.java:521)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1234)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1211)
at org.apache.orc.impl.PhysicalFsWriter.<init>(PhysicalFsWriter.java:95)
at org.apache.orc.impl.WriterImpl.<init>(WriterImpl.java:187)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:94)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:334)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:95)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:990)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:816)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:504)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:661)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genJoinObject(CommonJoinOperator.java:533)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:936)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:331)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:294)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.base/java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:198)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:177)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.lambda$create$5(GoogleHadoopFileSystem.java:547)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem$$Lambda$273/0x000000080077e040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$274/0x000000080077d040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.create(GoogleHadoopFileSystem.java:521)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1234)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1211)
at org.apache.orc.impl.PhysicalFsWriter.<init>(PhysicalFsWriter.java:95)
at org.apache.orc.impl.WriterImpl.<init>(WriterImpl.java:187)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:94)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:334)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:95)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:990)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:816)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:504)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:661)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genJoinObject(CommonJoinOperator.java:533)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:936)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:331)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:294)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:49, Vertex vertex_1731327513546_0052_5_08 [Reducer 4] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0

 

1 ACCEPTED SOLUTION

Master Collaborator
  • The job failed with an OutOfMemoryError (OOME) at the child task attempt level, as the stack trace indicates.
  • Certain MapReduce memory properties have been set that may override the heap sizing implied by hive.tez.container.size:

 

SET mapreduce.map.java.opts=-Xmx3686m;
SET mapreduce.reduce.java.opts=-Xmx3686m;
SET mapred.child.java.opts=-Xmx10g;



 

  • Validate the YARN application logs to confirm whether the child task attempts were launched with a heap (-Xmx) of roughly 80% of hive.tez.container.size. If they were not, remove the mapreduce configurations and re-run the job (a sketch of the resulting session settings follows this list).
  • Before re-running the query, collect statistics on all the source tables; up-to-date statistics help the optimizer build a better execution plan (an example is shown below).
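A minimal sketch of what the cleaned-up session could look like, assuming the 16 GB Tez container from the original post; the exact sizes depend on your cluster, and RESET <key> requires Hive 3.0 or later (on older versions, simply drop the mapreduce SET statements from the script instead):

-- Drop the MapReduce-era heap overrides so they cannot conflict with the Tez sizing (Hive 3.0+)
RESET mapreduce.map.java.opts;
RESET mapreduce.reduce.java.opts;
RESET mapred.child.java.opts;

-- Keep the task heap at roughly 80% of the Tez container size
set hive.tez.container.size=16384;
set hive.tez.java.opts=-Xmx13107m;
set tez.task.resource.memory.mb=16384;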

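For the statistics step, a hedged example using standard Hive ANALYZE syntax; source_db.orders is a placeholder for whichever source tables feed the INSERT OVERWRITE:

-- Basic table-level statistics (row counts, data sizes)
ANALYZE TABLE source_db.orders COMPUTE STATISTICS;
-- Column statistics give the optimizer better estimates for joins and filters
ANALYZE TABLE source_db.orders COMPUTE STATISTICS FOR COLUMNS;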

2 REPLIES

Community Manager

@Keyboard00 Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Hive experts @james_jones and @cravani, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator


