Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Out of Memory issue while executing simple Hive Query: java.lang.OutOfMemoryError: Java heap space

avatar
Explorer

Hive Table:

  1. Internal table
  2. ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' STORED AS TEXTFILE
  3. Data ingestion is successful
  4. Facing memory issues while querying!

Input file: JSON file

While researching on memory configuration, found some information here to calculate recommended memory configuration against available resources.

Our POC cluster size: 4 node cluster - 8 GB, 2 cores and 2 disks on each node

Please refer below information and let me know if memory needs to be extended or do TEZ memory configuration needs to be changed to overcome "out of memory" issue?

Calculated memory configuration uisng yarn-utils.py:

$ python yarn-utils.py -c 2 -m 8 -d 2 -k True

Using cores=2 memory=8GB disks=2 hbase=True

Profile: cores=2 memory=5120MB reserved=3GB usableMem=5GB disks=2

Num Container=4

Container Ram=1024MB

Used Ram=4GB

Unused Ram=3GB

yarn.scheduler.minimum-allocation-mb=1024

yarn.scheduler.maximum-allocation-mb=4096

yarn.nodemanager.resource.memory-mb=4096

mapreduce.map.memory.mb=1024

mapreduce.map.java.opts=-Xmx819m

mapreduce.reduce.memory.mb=2048

mapreduce.reduce.java.opts=-Xmx1638m

yarn.app.mapreduce.am.resource.mb=2048

yarn.app.mapreduce.am.command-opts=-Xmx1638m

mapreduce.task.io.sort.mb=409

Error Log:

<small> java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1542737559534_0026_1_00, diagnostics=[Task failed, taskId=task_1542737559534_0026_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3664)
	at java.lang.String.<init>(String.java:207)
	at java.lang.StringBuilder.toString(StringBuilder.java:407)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:159)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3664)
	at java.lang.String.<init>(String.java:207)
	at java.lang.StringBuilder.toString(StringBuilder.java:407)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:563)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:88)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:73)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:325)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
	... 14 more</small>
1 ACCEPTED SOLUTION

avatar
Explorer

This issue is resolved by reducing container size (increase memory for paramers - yarn.scheduler.minimum-allocation-mb, and tez.task.resource.memory.mb)

View solution in original post

2 REPLIES 2

avatar
Explorer

This issue is resolved by reducing container size (increase memory for paramers - yarn.scheduler.minimum-allocation-mb, and tez.task.resource.memory.mb)

avatar
New Contributor

Hi,

 

could you pleases tell me how to reducing container size. But, I executed query successfully but sometimes it got fails. please suggest it is very urgent issue going on.