HIVE job failed on TEZ

Rising Star

Hive job failed while using Tez as the execution engine:

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

____________________________________________________

TaskAttempt 3 failed, info=[Container container_1457392972594_9109_01_000505 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=75464,containerID=container_1457392972594_9109_01_000505] is running beyond physical memory limits. Current usage: 4.0 GB of 4 GB physical memory used; 7.4 GB of 8.4 GB virtual memory used. Killing container.

Dump of the process-tree for container_1457392972594_9109_01_000505 :

Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:250, Vertex vertex_1457392972594_8881_1_25 [Reducer 12] killed/failed due to:OWN_TASK_FAILURE]

ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:11

Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 27, vertexId=vertex_1457392972594_8881_1_00Vertex re-running, vertexName=Map 25, vertexId=vertex_1457392972594_8881_1_05Vertex re-running, vertexName=Map 7, vertexId=vertex_1457392972594_8881_1_11Vertex re-running, vertexName=Map 16, vertexId=vertex_1457392972594_8881_1_16Vertex re-running, vertexName=Reducer 17, vertexId=vertex_1457392972594_8881_1_17Vertex re-running, vertexName=Map 5, vertexId=vertex_1457392972594_8881_1_12Vertex re-running, vertexName=Map 23, vertexId=vertex_1457392972594_8881_1_03Vertex re-running, vertexName=Map 14, vertexId=vertex_1457392972594_8881_1_23Vertex re-running, vertexName=Map 27, vertexId=vertex_1457392972594_8881_1_00Vertex re-running, vertexName=Map 20, vertexId=vertex_1457392972594_8881_1_02Vertex re-running, vertexName=Map 9, vertexId=vertex_1457392972594_8881_1_09Vertex re-running, vertexName=Map 18, vertexId=vertex_1457392972594_8881_1_13Vertex re-running, vertexName=Map 5, vertexId=vertex_1457392972594_8881_1_12Vertex re-running, vertexName=Map 18, vertexId=vertex_1457392972594_8881_1_13Vertex re-running, vertexName=Map 27, vertexId=vertex_1457392972594_8881_1_00Vertex re-running, vertexName=Map 23, vertexId=vertex_1457392972594_8881_1_03Vertex re-running, vertexName=Map 14, vertexId=vertex_1457392972594_8881_1_23Vertex re-running, vertexName=Map 24, vertexId=vertex_1457392972594_8881_1_04Vertex re-running, vertexName=Reducer 8, vertexId=vertex_1457392972594_8881_1_14Vertex failed, vertexName=Reducer 12, vertexId=vertex_1457392972594_8881_1_25, diagnostics=[Task failed, taskId=task_1457392972594_8881_1_25_000011, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space

Caused by: java.lang.OutOfMemoryError: Java heap space

TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space

Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:250, Vertex vertex_1457392972594_8881_1_25 [Reducer 12] killed/failed due to:OWN_TASK_FAILURE]

[Task failed, taskId=task_1457392972594_8881_1_25_000011, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143

]], TaskAttempt 3 failed, info=[Container container_1457392972594_9109_01_000505 finished with diagnostics set to [Container failed, exitCode=-104. Container [pid=75464,containerID=container_1457392972594_9109_01_000505] is running beyond physical memory limits. Current usage: 4.0 GB of 4 GB physical memory used; 7.4 GB of 8.4 GB virtual memory used. Killing container.

set hive.execution.engine=tez; --set hive.execution.engine=mr;

set role admin;

set hive.support.sql11.reserved.keywords=false;

set hive.vectorized.execution.enabled=false;

Can someone please help?

9 REPLIES

Master Mentor

Your job ran out of memory. Please follow this guide to configure memory for your cluster: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_installing_manually_book/content/determi...

If you have a specific query tuning question, please provide the query.

Guru

We need to see the query statement and your current Hive/Tez/YARN settings in order to find the issue. It looks like one of your containers is running out of memory, but 4 GB is a decent size. You may need to address your query structure, or you may simply need more reducers or a larger maximum heap size for the containers that Tez requests. You will need to check your YARN container sizing as well, since Tez cannot ask for a larger container than YARN will allow.
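
As a rough sketch of the knobs mentioned above, the session-level settings below print the current values, request a larger Tez container with a matching heap, and lower the data volume per reducer so that more reducers are launched. The numbers are assumptions for illustration only, not recommendations for this cluster. The container size still has to fit within yarn.scheduler.maximum-allocation-mb, which is set in yarn-site.xml rather than per session.

```sql
-- Illustrative values only (assumptions); tune them for your own cluster.

-- "set <key>;" with no value prints the current setting.
set hive.tez.container.size;
set hive.tez.java.opts;

-- Request a larger Tez container, in MB.
-- Must not exceed yarn.scheduler.maximum-allocation-mb.
set hive.tez.container.size=8192;

-- Keep the JVM heap below the container size so off-heap usage
-- has headroom (roughly 80% is a common rule of thumb).
set hive.tez.java.opts=-Xmx6553m;

-- Lower the bytes handled per reducer (here 128 MB) so the work is
-- spread across more reducers.
set hive.exec.reducers.bytes.per.reducer=134217728;
```

The log in the question shows both a Java heap space error and a container killed at its 4 GB physical limit, so heap and container size probably need to be raised together rather than one at a time.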

Master Guru

Also verify that CBO is turned on and that you are using vectorization (it helps with memory). Run an explain plan on the query to determine what types of joins are being used.
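
A minimal sketch of those checks follows. The table and column names (my_fact, my_dim, dim_id, id, name) are hypothetical placeholders, since the actual query was not posted. Note that the settings in the question explicitly set hive.vectorized.execution.enabled=false.

```sql
-- Enable the cost-based optimizer and let it use table/column statistics.
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;

-- Enable vectorized execution (it applies mainly to ORC-backed tables).
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=true;

-- Inspect the plan and look at which join operators appear
-- (for example Map Join versus Merge Join).
-- my_fact and my_dim are hypothetical table names.
EXPLAIN
SELECT d.name, count(*)
FROM my_fact f
JOIN my_dim d ON f.dim_id = d.id
GROUP BY d.name;
```

CBO can only make good choices if statistics exist, so it may also be worth checking that the tables involved have been analyzed.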

Rising Star

Is anyone aware of this error?

[Reducer 17] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Map 16, vertexId=vertex_1457392972594_12162_1_16, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:5, Vertex vertex_1457392972594_12162_1_16 [Map 16] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:11 (state=08S01,code=2)
Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:45, Vertex vertex_1457392972594_12162_1_21 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, 4:32 AM

regards

bk

New Contributor

Hi,

I am running an HQL script (through hive -f) that contains count(*) or count(1) queries on 40 tables. The script often fails with the error below: sometimes midway (after 10 queries have executed) and sometimes very early (after 2 queries). I would really appreciate it if you could provide a solution. Thank you in advance.


Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex re-running, vertexName=Map 1, vertexId=vertex_1565763981051_1182_16_00Vertex re-running, vertexName=Map 1, vertexId=vertex_1565763981051_1182_16_00Vertex re-running, vertexName=Map 1, vertexId=vertex_1565763981051_1182_16_00Vertex failed, vertexName=Map 1, vertexId=vertex_1565763981051_1182_16_00, diagnostics=[Vertex 

 

Mahesh

New Contributor

Hi, I am facing the same issue. Were you able to resolve it?

Thanks in advance.

Community Manager

Shubham_Ranjan, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to provide details specific to your environment that could help others give a more accurate answer to your question.


Cy Jervis, Manager, Community Program
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

New Contributor

Suresh_b_k, did the issue get resolved? If yes, can you please let me know what resolved it?

Community Manager

@jagan20, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager

