- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Out of Memory issue while executing simple Hive Query: java.lang.OutOfMemoryError: Java heap space
- Labels:
-
Apache Hive
-
Apache Tez
-
Apache YARN
Created ‎01-08-2019 07:15 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hive Table:
- Internal table
- ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE
- Data ingestion is successful
- Facing memory issues while querying!
Input file: JSON file
While researching on memory configuration, found some information here to calculate recommended memory configuration against available resources.
Our POC cluster size: 4 node cluster - 8 GB, 2 cores and 2 disks on each node
Please refer below information and let me know if memory needs to be extended or do TEZ memory configuration needs to be changed to overcome "out of memory" issue?
Calculated memory configuration uisng yarn-utils.py:
$ python yarn-utils.py -c 2 -m 8 -d 2 -k True
Using cores=2 memory=8GB disks=2 hbase=True
Profile: cores=2 memory=5120MB reserved=3GB usableMem=5GB disks=2
Num Container=4
Container Ram=1024MB
Used Ram=4GB
Unused Ram=3GB
yarn.scheduler.minimum-allocation-mb=1024
yarn.scheduler.maximum-allocation-mb=4096
yarn.nodemanager.resource.memory-mb=4096
mapreduce.map.memory.mb=1024
mapreduce.map.java.opts=-Xmx819m
mapreduce.reduce.memory.mb=2048
mapreduce.reduce.java.opts=-Xmx1638m
yarn.app.mapreduce.am.resource.mb=2048
yarn.app.mapreduce.am.command-opts=-Xmx1638m
mapreduce.task.io.sort.mb=409
Error Log:
<small> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1542737559534_0026_1_00, diagnostics=[Task failed, taskId=task_1542737559534_0026_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOfRange(Arrays.java:3664) at java.lang.String.<init>(String.java:207) at java.lang.StringBuilder.toString(StringBuilder.java:407) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) ... 14 more ], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOfRange(Arrays.java:3664) at java.lang.String.<init>(String.java:207) at java.lang.StringBuilder.toString(StringBuilder.java:407) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) ... 14 more</small>
Created ‎01-11-2019 04:06 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
This issue is resolved by reducing container size (increase memory for paramers - yarn.scheduler.minimum-allocation-mb, and tez.task.resource.memory.mb)
Created ‎01-11-2019 04:06 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
This issue is resolved by reducing container size (increase memory for paramers - yarn.scheduler.minimum-allocation-mb, and tez.task.resource.memory.mb)
Created ‎01-24-2022 02:48 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
could you pleases tell me how to reducing container size. But, I executed query successfully but sometimes it got fails. please suggest it is very urgent issue going on.
