Support Questions

Find answers, ask questions, and share your expertise

Hive on Tez query Map output OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.&lt;init&gt;(HeapByteBuffer.java:57)

Explorer

When I use Hive on Tez to run an INSERT OVERWRITE TABLE from another table, I get the following error. It does not happen every time; sometimes the query succeeds:

"Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1449486079177_5239_1_01, diagnostics=[Task failed, taskId=task_1449486079177_5239_1_01_000018, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task: attempt_1449486079177_5239_1_01_000018_0:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:157) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57) at java.nio.ByteBuffer.allocate(ByteBuffer.java:331) at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.(PipelinedSorter.java:173) at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.(PipelinedSorter.java:117) at 
org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:141) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:141) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) ... 14 more ], TaskAttempt 1 failed, info=[Error: Failure while running task: attempt_1449486079177_5239_1_01_000018_1:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:157) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:348) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.OutOfMemoryError: Java heap space at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57) at java.nio.ByteBuffer.allocate(ByteBuffer.java:331) at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.(PipelinedSorter.java:173) at 
org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.(PipelinedSorter.java:117) at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:141) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:141) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) ... 14 more ], TaskAttempt 2
1 ACCEPTED SOLUTION

Master Collaborator

Try the following; let's assume your hive.tez.container.size=2048.

set hive.tez.java.opts=-Xmx1640m (0.8 times hive.tez.container.size)

set tez.runtime.io.sort.mb=820 (0.4 times hive.tez.container.size)

set tez.runtime.unordered.output.buffer.size-mb=205 (0.1 times hive.tez.container.size)
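The rule above can be sketched as a small shell calculation (the 0.8/0.4/0.1 ratios are from this reply; integer arithmetic is used here, so the results land slightly below the hand-rounded 1640/820/205 figures):

```shell
#!/bin/sh
# Sketch: derive the three settings from hive.tez.container.size.
CONTAINER_MB=2048

XMX_MB=$(( CONTAINER_MB * 8 / 10 ))   # hive.tez.java.opts -Xmx (0.8x)
SORT_MB=$(( CONTAINER_MB * 4 / 10 ))  # tez.runtime.io.sort.mb (0.4x)
BUFFER_MB=$(( CONTAINER_MB / 10 ))    # tez.runtime.unordered.output.buffer.size-mb (0.1x)

# The echoed lines are what you would paste into the Hive session.
echo "set hive.tez.java.opts=-Xmx${XMX_MB}m;"
echo "set tez.runtime.io.sort.mb=${SORT_MB};"
echo "set tez.runtime.unordered.output.buffer.size-mb=${BUFFER_MB};"
```

If you change hive.tez.container.size, recompute all three values so they keep the same proportions.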


10 REPLIES

Explorer

This happens if the sum of the data sizes is greater than the amount of memory reserved for the map-join hash tables (see the config parameter below).

hive.auto.convert.join.noconditionaltask.size=1370MB
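A hedged illustration of that threshold check (the table sizes below are made-up numbers; only the hive.auto.convert.join.noconditionaltask.size budget comes from this reply):

```shell
#!/bin/sh
# Hypothetical scenario: two small join-side tables vs. the hash-table
# memory budget (hive.auto.convert.join.noconditionaltask.size ~= 1370MB).
THRESHOLD_MB=1370
TABLE_A_MB=900   # assumed size of first small table
TABLE_B_MB=600   # assumed size of second small table

SUM_MB=$(( TABLE_A_MB + TABLE_B_MB ))
if [ "$SUM_MB" -gt "$THRESHOLD_MB" ]; then
  echo "sum ${SUM_MB}MB exceeds the ${THRESHOLD_MB}MB budget: map join can OOM"
else
  echo "sum ${SUM_MB}MB fits in the reserved hash-table memory"
fi
```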

Explorer

Setting hive.auto.convert.join.noconditionaltask.size=1436549120 didn't work. After setting hive.tez.container.size=2048 and hive.tez.java.opts=-Xmx1700m, the OOM problem was solved.

Master Collaborator

Try the following; let's assume your hive.tez.container.size=2048.

set hive.tez.java.opts=-Xmx1640m (0.8 times hive.tez.container.size)

set tez.runtime.io.sort.mb=820 (0.4 times hive.tez.container.size)

set tez.runtime.unordered.output.buffer.size-mb=205 (0.1 times hive.tez.container.size)

New Contributor

I am running on Azure using the Maria_dev login; where can I enter these 3 lines of code? In the Tez config, everything is locked and cannot be edited.

New Contributor
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1455546410616_13085, Tracking URL = http://ndrm:8088/proxy/application_1455546410616_13085/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1455546410616_13085
Hadoop job information for Stage-1: number of mappers: 7; number of reducers: 0
2016-03-03 13:39:54,224 Stage-1 map = 0%, reduce = 0%
2016-03-03 13:40:04,733 Stage-1 map = 57%, reduce = 0%, Cumulative CPU 13.0 sec
2016-03-03 13:40:26,943 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 112.9 sec
2016-03-03 13:40:30,114 Stage-1 map = 96%, reduce = 0%, Cumulative CPU 142.98 sec
2016-03-03 13:40:48,010 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 104.61 sec
2016-03-03 13:41:22,610 Stage-1 map = 96%, reduce = 0%, Cumulative CPU 142.05 sec
2016-03-03 13:41:40,425 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 104.61 sec
2016-03-03 13:42:16,026 Stage-1 map = 96%, reduce = 0%, Cumulative CPU 143.26 sec
2016-03-03 13:42:34,857 Stage-1 map = 86%, reduce = 0%, Cumulative CPU 104.61 sec
2016-03-03 13:43:09,393 Stage-1 map = 96%, reduce = 0%, Cumulative CPU 144.34 sec
2016-03-03 13:43:28,197 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 104.61 sec
MapReduce Total cumulative CPU time: 1 minutes 44 seconds 610 msec
Ended Job = job_1455546410616_13085 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1455546410616_13085_m_000003 (and more) from job job_1455546410616_13085
Task with the most failures(4):
-----
Task ID: task_1455546410616_13085_m_000000
URL: http://ndrm:8088/taskdetails.jsp?jobid=job_1455546410616_13085&tipid=task_1455546410616_13085_m_0000...
-----
Diagnostic Messages for this Task: Error: Java heap space

Increasing the JVM memory and the map memory allocated by the container helped for me.

Below are the values used:

hive> set mapreduce.map.memory.mb=4096;

hive> set mapreduce.map.java.opts=-Xmx3600M;

In case you still get the Java heap error, try increasing to higher values, but make sure that the -Xmx in mapreduce.map.java.opts does not exceed mapreduce.map.memory.mb.

In the case of Tez, you may have to set hive.tez.java.opts=-Xmx3600M;
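The constraint above (-Xmx must fit inside the container) can be sketched as a quick sanity check before applying the settings; the two values are the ones from this reply, so adjust them to your cluster:

```shell
#!/bin/sh
# Guard: the -Xmx given to mapreduce.map.java.opts (or hive.tez.java.opts
# on Tez) must stay below the container size in mapreduce.map.memory.mb.
MAP_MEMORY_MB=4096
XMX_MB=3600

if [ "$XMX_MB" -lt "$MAP_MEMORY_MB" ]; then
  echo "ok: -Xmx${XMX_MB}M fits inside the ${MAP_MEMORY_MB}MB container"
else
  echo "error: -Xmx must be lower than mapreduce.map.memory.mb"
  exit 1
fi
```

A common rule of thumb in this thread is to leave roughly 10-20% of the container for non-heap overhead, which 3600M inside 4096MB satisfies.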

Thanks

New Contributor

@Jun Chen

SSH to your server, open /etc/tez/conf/tez-site.xml, and make these changes; if they do not work, try larger values:

  • tez.am.resource.memory.mb -> 768
  • tez.task.resource.memory.mb -> 768
  • tez.am.java.opts -> -Xmx560m -Xms560m

The same for /etc/tez/conf/hive-site.xml:

  • hive.tez.container.size -> 768
  • hive.tez.java.opts -> -Xmx560m -Xms560m
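For reference, the tez-site.xml entries above would look something like this (the property names are the standard Tez ones; the values are the starting points from this reply, to be raised if the OOM persists):

```xml
<!-- /etc/tez/conf/tez-site.xml -->
<property>
  <name>tez.am.resource.memory.mb</name>
  <value>768</value>
</property>
<property>
  <name>tez.task.resource.memory.mb</name>
  <value>768</value>
</property>
<property>
  <name>tez.am.java.opts</name>
  <value>-Xmx560m -Xms560m</value>
</property>
```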

Then run

$> su hive

$> hive

and run your query.

Thanks, your solution worked for me - but there's a minor typo, I think you mean /etc/hive/conf/hive-site.xml for the second file, not /etc/tez/conf/hive-site.xml.

New Contributor

Worked super! Thank you.

Explorer

I'm running into the same situation:

 

 Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space
  at java.nio.HeapByteBuffer.&lt;init&gt;(HeapByteBuffer.java:57)
  at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.allocateSpace(PipelinedSorter.java:250)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.end(PipelinedSorter.java:1054)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.next(PipelinedSorter.java:1009)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:318)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:423)
  at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:379)
  at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:167)
  at org.apache.hadoop.hive.ql.exec.tez.TezProcessor$TezKVOutputCollector.collect(TezProcessor.java:355)
  at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:511)
  at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.process(ReduceSinkOperator.java:367)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.flushHashTable(GroupByOperator.java:998)
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:750)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.internalForward(CommonJoinOperator.java:825)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:857)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:941)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:590)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:126)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
  at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
  at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
  at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)

 

I'm not sure why we need to manually set container or buffer sizes. Shouldn't Tez do the calculations and use only what's available?

Community Manager

@dz902 as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.