
Hive Job - OutOfMemoryError: Java heap space

New Contributor

On our cluster, we are running an INSERT OVERWRITE query through the Hive CLI, and it fails with OutOfMemoryError: Java heap space.
I have tried multiple SET parameters and updated the machine configuration, but with no luck.

-- Dynamic partitioning limits
set hive.exec.max.dynamic.partitions=4000;
set hive.exec.max.dynamic.partitions.pernode=500;
set hive.exec.dynamic.partition.mode=nonstrict;
-- MapReduce-era JVM heap settings
SET mapreduce.map.java.opts=-Xmx3686m;
SET mapreduce.reduce.java.opts=-Xmx3686m;
SET mapred.child.java.opts=-Xmx10g;
-- Tez container and ApplicationMaster memory (MB)
set hive.tez.container.size=16384;
set tez.task.resource.memory.mb=16384;
set tez.am.resource.memory.mb=8192;
-- Transactions and concurrency disabled
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
set hive.support.concurrency=false;
-- Vectorization, ORC split strategy, reducer cap
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;
set hive.exec.orc.split.strategy=BI;
set hive.exec.reducers.max=150;

Error thrown:

Status: Failed
Vertex failed, vertexName=Reducer 4, vertexId=vertex_1731327513546_0052_5_08, diagnostics=[Task failed, taskId=task_1731327513546_0052_5_08_000045, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
at java.base/java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:198)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:177)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.lambda$create$5(GoogleHadoopFileSystem.java:547)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem$$Lambda$273/0x000000080077e040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$274/0x000000080077d040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.create(GoogleHadoopFileSystem.java:521)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1234)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1211)
at org.apache.orc.impl.PhysicalFsWriter.<init>(PhysicalFsWriter.java:95)
at org.apache.orc.impl.WriterImpl.<init>(WriterImpl.java:187)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:94)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:334)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:95)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:990)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:816)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:504)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:661)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genJoinObject(CommonJoinOperator.java:533)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:936)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:331)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:294)
, errorMessage=Cannot recover from this error:java.lang.OutOfMemoryError: Java heap space
at java.base/java.io.BufferedOutputStream.<init>(BufferedOutputStream.java:75)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.createOutputStream(GoogleHadoopOutputStream.java:198)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopOutputStream.<init>(GoogleHadoopOutputStream.java:177)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.lambda$create$5(GoogleHadoopFileSystem.java:547)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem$$Lambda$273/0x000000080077e040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding$$Lambda$274/0x000000080077d040.apply(Unknown Source)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.create(GoogleHadoopFileSystem.java:521)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1234)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1211)
at org.apache.orc.impl.PhysicalFsWriter.<init>(PhysicalFsWriter.java:95)
at org.apache.orc.impl.WriterImpl.<init>(WriterImpl.java:187)
at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:94)
at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:334)
at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:95)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:990)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:816)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.createForwardJoinObject(CommonJoinOperator.java:504)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genObject(CommonJoinOperator.java:661)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genJoinObject(CommonJoinOperator.java:533)
at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:936)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinObject(CommonMergeJoinOperator.java:331)
at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinOneGroup(CommonMergeJoinOperator.java:294)
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:49, Vertex vertex_1731327513546_0052_5_08 [Reducer 4] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0

 


Community Manager

@Keyboard00 Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Hive experts @james_jones and @cravani, who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator



Master Collaborator
  • The job failed with an OutOfMemoryError (OOME) at the child task attempt level, as the stack trace indicates.
  • Certain MapReduce properties have been set that may override the task JVM heap (-Xmx) that Hive would otherwise derive from hive.tez.container.size:

 

SET mapreduce.map.java.opts=-Xmx3686m;
SET mapreduce.reduce.java.opts=-Xmx3686m;
SET mapred.child.java.opts=-Xmx10g;



 

  • Check the YARN application logs (e.g. yarn logs -applicationId application_1731327513546_0052) to confirm whether the child task attempts were launched with roughly 80% of hive.tez.container.size as their heap. If not, remove the MapReduce configurations and re-run the job, as in the first sketch below.
  • Before re-running the query, collect statistics on all the source tables; this helps the optimizer build a better execution plan (see the second example below).
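
On the Tez engine, the task heap is normally derived from hive.tez.container.size, so one way to act on the first point is to drop the MapReduce-era options and size the Tez heap explicitly. A minimal sketch, assuming Hive 2.0+ (where RESET <key> restores a single property to its default); the -Xmx value below is illustrative, roughly 80% of the 16384 MB container:

-- Remove the MapReduce-era JVM options so they cannot override Tez task sizing
RESET mapreduce.map.java.opts;
RESET mapreduce.reduce.java.opts;
RESET mapred.child.java.opts;
-- Give the Tez task JVM ~80% of the container as heap
set hive.tez.java.opts=-Xmx13107m;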
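
And a minimal example of gathering statistics before the re-run (source_db.source_table is a hypothetical placeholder; partitioned tables also accept a PARTITION clause):

-- Table-level statistics (row counts, sizes)
ANALYZE TABLE source_db.source_table COMPUTE STATISTICS;
-- Column-level statistics, which help join ordering and reducer estimates
ANALYZE TABLE source_db.source_table COMPUTE STATISTICS FOR COLUMNS;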