Out of Memory issue while executing simple Hive Query: java.lang.OutOfMemoryError: Java heap space

Hive Table:

  1. Internal table
  2. ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE
  3. Data ingestion is successful
  4. Facing memory issues while querying!

Input file: JSON file

While researching memory configuration, I found some information on how to calculate the recommended memory configuration for the available resources.

Our POC cluster size: 4 node cluster - 8 GB, 2 cores and 2 disks on each node

Please refer to the information below and let me know whether the memory needs to be extended or the Tez memory configuration needs to be changed to overcome the "out of memory" issue.

Calculated memory configuration using yarn-utils.py:

$ python yarn-utils.py -c 2 -m 8 -d 2 -k True

Using cores=2 memory=8GB disks=2 hbase=True

Profile: cores=2 memory=5120MB reserved=3GB usableMem=5GB disks=2

Num Container=4

Container Ram=1024MB

Used Ram=4GB

Unused Ram=3GB

yarn.scheduler.minimum-allocation-mb=1024

yarn.scheduler.maximum-allocation-mb=4096

yarn.nodemanager.resource.memory-mb=4096

mapreduce.map.memory.mb=1024

mapreduce.map.java.opts=-Xmx819m

mapreduce.reduce.memory.mb=2048

mapreduce.reduce.java.opts=-Xmx1638m

yarn.app.mapreduce.am.resource.mb=2048

yarn.app.mapreduce.am.command-opts=-Xmx1638m

mapreduce.task.io.sort.mb=409
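
The values above appear to follow simple ratios of the 1024 MB container size. The sketch below reproduces them so the relationships are easier to see. The ratios (heap at 80% of the container, reduce and AM containers at twice the map container, sort buffer at 40%) are inferred from the printed output and are an assumption, not a copy of yarn-utils.py; `derive_settings` is a hypothetical helper name.

```python
def derive_settings(container_mb, num_containers):
    """Rebuild the settings listed above from the container size.

    Assumption: the ratios are inferred from the printed output,
    not taken from the yarn-utils.py source.
    """
    total_mb = container_mb * num_containers
    reduce_mb = 2 * container_mb
    return {
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        "yarn.scheduler.maximum-allocation-mb": total_mb,
        "yarn.nodemanager.resource.memory-mb": total_mb,
        "mapreduce.map.memory.mb": container_mb,
        # Integer arithmetic keeps the truncation explicit (1024 * 8 // 10 = 819).
        "mapreduce.map.java.opts": "-Xmx%dm" % (container_mb * 8 // 10),
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.reduce.java.opts": "-Xmx%dm" % (reduce_mb * 8 // 10),
        "yarn.app.mapreduce.am.resource.mb": reduce_mb,
        "yarn.app.mapreduce.am.command-opts": "-Xmx%dm" % (reduce_mb * 8 // 10),
        "mapreduce.task.io.sort.mb": container_mb * 4 // 10,
    }


if __name__ == "__main__":
    # Num Container=4 and Container Ram=1024MB, as reported above.
    for key, value in derive_settings(1024, 4).items():
        print("%s=%s" % (key, value))
```

If these ratios hold, a map-side task in a 1024 MB container has roughly 819 MB of heap to work with.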

Error Log:

<small> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1542737559534_0026_1_00, diagnostics=[Task failed, taskId=task_1542737559534_0026_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3664)
	at java.lang.String.<init>(String.java:207)
	at java.lang.StringBuilder.toString(StringBuilder.java:407)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3664)
	at java.lang.String.<init>(String.java:207)
	at java.lang.StringBuilder.toString(StringBuilder.java:407)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more</small>

Accepted Solution

This issue was resolved by adjusting the container size: I increased the memory for the parameters yarn.scheduler.minimum-allocation-mb and tez.task.resource.memory.mb.
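
A rough sketch of the trade-off behind this change: giving each Tez task more memory leaves room for fewer containers on a node. The 2048 MB figure is a hypothetical example, since the final values are not stated here. `plan_tez_task` is an illustrative helper, the 80% heap fraction is a common rule of thumb rather than a Tez default, and rounding requests up to a multiple of the minimum allocation is an assumption that may depend on the YARN scheduler in use.

```python
import math


def plan_tez_task(task_mb, min_alloc_mb, max_alloc_mb, node_mb, heap_fraction=0.8):
    """Estimate what YARN would grant for one Tez task container.

    Assumptions: requests are rounded up to a multiple of
    yarn.scheduler.minimum-allocation-mb, and the JVM heap is set to
    heap_fraction of the requested task memory.
    """
    granted_mb = math.ceil(task_mb / min_alloc_mb) * min_alloc_mb
    if granted_mb > max_alloc_mb:
        raise ValueError("request exceeds yarn.scheduler.maximum-allocation-mb")
    return {
        "granted_mb": granted_mb,
        "heap_mb": int(task_mb * heap_fraction),
        "containers_per_node": node_mb // granted_mb,
    }


if __name__ == "__main__":
    # Original setup from the question: 1024 MB containers on a 4096 MB node.
    print(plan_tez_task(1024, 1024, 4096, 4096))
    # Hypothetical increase: 2048 MB for both minimum allocation and Tez task memory.
    print(plan_tez_task(2048, 2048, 4096, 4096))
```

Under these assumptions, doubling the task memory roughly doubles the available heap but halves the number of containers that fit on each node.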
