Tez job stays in running state and hangs

Hi all,

I have a Hive query on Tez that runs every hour. It normally succeeds, but twice so far (Dec 28th and Jan 16th) it kept running in a hung state and never finished. The query is part of a script with 10 queries, and the same query was the problem both times. Here is what I found in the AM log. Please let me know if you need any more information.

    2017-01-16 13:50:23,412 [INFO] [IPC Server handler 12 on 57912] |app.TaskAttemptListenerImpTezDag|: Container with id: container_e08_1479141851984_20667_01_000109 given task: attempt_1479141851984_20667_1_09_000000_0
    2017-01-16 13:50:23,416 [INFO] [IPC Server handler 6 on 57912] |impl.TaskImpl|: TaskAttempt:attempt_1479141851984_20667_1_09_000000_0 sent events: (0-1).
    2017-01-16 13:50:23,416 [INFO] [IPC Server handler 6 on 57912] |impl.VertexImpl|: Sending attempt_1479141851984_20667_1_09_000000_0 1 events [0,1) total 1 vertex_1479141851984_20667_1_09 [Map 1]
    2017-01-16 13:50:24,432 [INFO] [IPC Server handler 12 on 57912] |common.AsyncDispatcher|: Size of event-queue is 1000
    2017-01-16 13:50:32,528 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000023, containerExpiryTime=1484592632331, idleTimeout=10000, taskRequestsCount=0, heldContainers=174, delayedContainers=38, isNew=false
    2017-01-16 13:50:32,536 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000011, containerExpiryTime=1484592632403, idleTimeout=10000, taskRequestsCount=0, heldContainers=173, delayedContainers=37, isNew=false
    2017-01-16 13:50:32,800 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000012, containerExpiryTime=1484592632689, idleTimeout=10000, taskRequestsCount=0, heldContainers=172, delayedContainers=36, isNew=false
    2017-01-16 13:50:33,074 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000052, containerExpiryTime=1484592633008, idleTimeout=10000, taskRequestsCount=0, heldContainers=171, delayedContainers=35, isNew=false
    2017-01-16 13:50:33,323 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000159, containerExpiryTime=1484592633197, idleTimeout=10000, taskRequestsCount=0, heldContainers=170, delayedContainers=34, isNew=false
    2017-01-16 13:50:33,436 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000102, containerExpiryTime=1484592633414, idleTimeout=10000, taskRequestsCount=0, heldContainers=169, delayedContainers=33, isNew=false
    2017-01-16 13:50:34,102 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000139, containerExpiryTime=1484592634007, idleTimeout=10000, taskRequestsCount=0, heldContainers=168, delayedContainers=32, isNew=false
    2017-01-16 13:50:34,154 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000091, containerExpiryTime=1484592634145, idleTimeout=10000, taskRequestsCount=0, heldContainers=167, delayedContainers=31, isNew=false
    2017-01-16 13:50:34,802 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000088, containerExpiryTime=1484592634570, idleTimeout=10000, taskRequestsCount=0, heldContainers=166, delayedContainers=30, isNew=false
    2017-01-16 13:50:35,339 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000100, containerExpiryTime=1484592635334, idleTimeout=10000, taskRequestsCount=0, heldContainers=165, delayedContainers=29, isNew=false
    2017-01-16 13:50:35,430 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000045, containerExpiryTime=1484592635349, idleTimeout=10000, taskRequestsCount=0, heldContainers=164, delayedContainers=28, isNew=false
    2017-01-16 13:50:35,465 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000032, containerExpiryTime=1484592635279, idleTimeout=10000, taskRequestsCount=0, heldContainers=163, delayedContainers=27, isNew=false
    2017-01-16 13:50:35,530 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000062, containerExpiryTime=1484592635489, idleTimeout=10000, taskRequestsCount=0, heldContainers=162, delayedContainers=26, isNew=false
    2017-01-16 13:50:35,714 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000198, containerExpiryTime=1484592635621, idleTimeout=10000, taskRequestsCount=0, heldContainers=161, delayedContainers=25, isNew=false
    2017-01-16 13:50:36,015 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000156, containerExpiryTime=1484592635770, idleTimeout=10000, taskRequestsCount=0, heldContainers=160, delayedContainers=24, isNew=false
    2017-01-16 13:50:36,031 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000092, containerExpiryTime=1484592635925, idleTimeout=10000, taskRequestsCount=0, heldContainers=159, delayedContainers=23, isNew=false
    2017-01-16 13:50:36,476 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000103, containerExpiryTime=1484592636262, idleTimeout=10000, taskRequestsCount=0, heldContainers=158, delayedContainers=22, isNew=false
    2017-01-16 13:50:36,515 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000184, containerExpiryTime=1484592636434, idleTimeout=10000, taskRequestsCount=0, heldContainers=157, delayedContainers=21, isNew=false
    2017-01-16 13:50:36,683 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000047, containerExpiryTime=1484592636554, idleTimeout=10000, taskRequestsCount=0, heldContainers=156, delayedContainers=20, isNew=false
    2017-01-16 13:50:36,701 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000199, containerExpiryTime=1484592636659, idleTimeout=10000, taskRequestsCount=0, heldContainers=155, delayedContainers=19, isNew=false
    2017-01-16 13:50:37,352 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000014, containerExpiryTime=1484592637215, idleTimeout=10000, taskRequestsCount=0, heldContainers=154, delayedContainers=18, isNew=false
    2017-01-16 13:50:37,804 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000147, containerExpiryTime=1484592637638, idleTimeout=10000, taskRequestsCount=0, heldContainers=153, delayedContainers=17, isNew=false
    2017-01-16 13:50:38,031 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000157, containerExpiryTime=1484592637925, idleTimeout=10000, taskRequestsCount=0, heldContainers=152, delayedContainers=16, isNew=false
    2017-01-16 13:50:38,090 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000013, containerExpiryTime=1484592637894, idleTimeout=10000, taskRequestsCount=0, heldContainers=151, delayedContainers=15, isNew=false
    2017-01-16 13:50:38,476 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000021, containerExpiryTime=1484592638355, idleTimeout=10000, taskRequestsCount=0, heldContainers=150, delayedContainers=14, isNew=false
    2017-01-16 13:50:38,804 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000104, containerExpiryTime=1484592638562, idleTimeout=10000, taskRequestsCount=0, heldContainers=149, delayedContainers=13, isNew=false
    2017-01-16 13:50:38,976 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000083, containerExpiryTime=1484592638820, idleTimeout=10000, taskRequestsCount=0, heldContainers=148, delayedContainers=12, isNew=false
    2017-01-16 13:50:40,049 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000048, containerExpiryTime=1484592640020, idleTimeout=10000, taskRequestsCount=0, heldContainers=147, delayedContainers=11, isNew=false
    2017-01-16 13:50:40,077 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000006, containerExpiryTime=1484592639895, idleTimeout=10000, taskRequestsCount=0, heldContainers=146, delayedContainers=10, isNew=false
    2017-01-16 13:50:40,604 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000143, containerExpiryTime=1484592640531, idleTimeout=10000, taskRequestsCount=0, heldContainers=145, delayedContainers=9, isNew=false
    2017-01-16 13:50:41,050 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000099, containerExpiryTime=1484592640988, idleTimeout=10000, taskRequestsCount=0, heldContainers=144, delayedContainers=8, isNew=false
    2017-01-16 13:50:41,050 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000195, containerExpiryTime=1484592640891, idleTimeout=10000, taskRequestsCount=0, heldContainers=143, delayedContainers=7, isNew=false
    2017-01-16 13:50:41,478 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000101, containerExpiryTime=1484592641352, idleTimeout=10000, taskRequestsCount=0, heldContainers=142, delayedContainers=6, isNew=false
    2017-01-16 13:50:41,805 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000174, containerExpiryTime=1484592641798, idleTimeout=10000, taskRequestsCount=0, heldContainers=141, delayedContainers=5, isNew=false
    2017-01-16 13:50:42,555 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000126, containerExpiryTime=1484592642429, idleTimeout=10000, taskRequestsCount=0, heldContainers=140, delayedContainers=4, isNew=false
    2017-01-16 13:50:42,591 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000002, containerExpiryTime=1484592642472, idleTimeout=10000, taskRequestsCount=0, heldContainers=139, delayedContainers=3, isNew=false
    2017-01-16 13:50:42,754 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000175, containerExpiryTime=1484592642589, idleTimeout=10000, taskRequestsCount=0, heldContainers=138, delayedContainers=2, isNew=false
    2017-01-16 13:50:43,077 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000064, containerExpiryTime=1484592642946, idleTimeout=10000, taskRequestsCount=0, heldContainers=137, delayedContainers=1, isNew=false
    2017-01-16 13:50:43,083 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: No taskRequests. Container's idle timeout delay expired or is new. Releasing container, containerId=container_e08_1479141851984_20667_01_000005, containerExpiryTime=1484592642937, idleTimeout=10000, taskRequestsCount=0, heldContainers=136, delayedContainers=0, isNew=false
    2017-01-16 15:58:34,227 [INFO] [Thread-3] |app.DAGAppMaster|: DAGAppMasterShutdownHook invoked
    2017-01-16 15:58:34,228 [INFO] [Thread-3] |app.DAGAppMaster|: DAGAppMaster received a signal. Signaling TaskScheduler
    2017-01-16 15:58:34,228 [INFO] [Thread-3] |rm.TaskSchedulerEventHandler|: TaskScheduler notified that iSignalled was : true
    2017-01-16 15:58:34,228 [INFO] [Thread-3] |rm.YarnTaskSchedulerService|: Initiating stop of YarnTaskScheduler
    2017-01-16 15:58:34,228 [INFO] [Thread-3] |rm.YarnTaskSchedulerService|: Releasing held containers
    2017-01-16 15:58:34,230 [INFO] [Thread-3] |rm.YarnTaskSchedulerService|: Removing all pending taskRequests
    2017-01-16 15:58:34,230 [INFO] [Thread-3] |history.HistoryEventHandler|: Stopping HistoryEventHandler
    2017-01-16 15:58:34,231 [INFO] [Thread-3] |recovery.RecoveryService|: Stopping RecoveryService
    2017-01-16 15:58:34,231 [INFO] [Thread-3] |recovery.RecoveryService|: Handle the remaining events in queue, queue size=0
    2017-01-16 15:58:34,231 [INFO] [RecoveryEventHandlingThread] |recovery.RecoveryService|: EventQueue take interrupted. Returning
    2017-01-16 15:58:34,231 [INFO] [Thread-3] |recovery.RecoveryService|: Closing Summary Stream
    2017-01-16 15:58:34,249 [INFO] [AMRM Heartbeater thread] |impl.AMRMClientAsyncImpl|: Shutdown requested. Stopping callback.
    2017-01-16 15:58:34,299 [INFO] [Thread-3] |recovery.RecoveryService|: Closing Output Stream for DAG dag_1479141851984_20667_1
    2017-01-16 15:58:34,307 [INFO] [Thread-3] |ats.ATSV15HistoryLoggingService|: Stopping ATSService, eventQueueBacklog=0
    2017-01-16 15:58:34,307 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: AllocatedContainerManager Thread interrupted
    2017-01-16 15:58:34,310 [ERROR] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Got TaskSchedulerError, org.apache.tez.dag.api.TezUncheckedException: java.lang.InterruptedException
        at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper.getProgress(TaskSchedulerAppCallbackWrapper.java:106)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getProgress(YarnTaskSchedulerService.java:930)
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:308)
    Caused by: java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper.getProgress(TaskSchedulerAppCallbackWrapper.java:104)
        ... 2 more
    2017-01-16 15:58:34,310 [ERROR] [AMRM Callback Handler Thread] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[AMRM Callback Handler Thread,5,main] threw an Throwable, but we are shutting down, so ignoring this
    org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.tez.dag.api.TezUncheckedException: java.lang.InterruptedException
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:312)
    Caused by: org.apache.tez.dag.api.TezUncheckedException: java.lang.InterruptedException
        at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper.getProgress(TaskSchedulerAppCallbackWrapper.java:106)
        at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getProgress(YarnTaskSchedulerService.java:930)
        at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:308)
    Caused by: java.lang.InterruptedException
        at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
        at java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper.getProgress(TaskSchedulerAppCallbackWrapper.java:104)
        ... 2 more
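
In case it helps anyone reading a longer copy of this log, the hang is easiest to see as a stretch with no AM output at all. Below is a rough sketch of a script that finds the longest silent gap between consecutive timestamped lines. The `largest_gap` helper name is my own (hypothetical), and it assumes every log entry starts with a `yyyy-MM-dd HH:mm:ss,SSS` timestamp as in the excerpt above; the three sample lines are shortened copies of entries from that excerpt.

```python
import re
from datetime import datetime

# Timestamp at the start of an AM log entry, e.g. "2017-01-16 13:50:43,083".
# Assumption: this layout matches the log excerpt; adjust if your log4j pattern differs.
TS_RE = re.compile(r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence between
    two consecutive timestamped lines, or None if fewer than two are found."""
    best = None
    prev_ts = prev_line = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # stack-trace frames and "Caused by" lines have no timestamp
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if prev_ts is not None:
            gap = ts - prev_ts
            if best is None or gap > best[0]:
                best = (gap, prev_line.strip(), line.strip())
        prev_ts, prev_line = ts, line
    return best


# Shortened sample entries taken from the excerpt above.
sample = [
    "2017-01-16 13:50:43,077 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: Releasing container",
    "2017-01-16 13:50:43,083 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: Releasing container",
    "2017-01-16 15:58:34,227 [INFO] [Thread-3] |app.DAGAppMaster|: DAGAppMasterShutdownHook invoked",
]

gap, before, after = largest_gap(sample)
print(gap)     # 2:07:51.144000
print(before)
print(after)
```

On the excerpt this points at the gap between the last container release at 13:50:43 and the shutdown hook at 15:58:34. To run it on a full log, pass the open file instead of `sample`, for example `largest_gap(open("am.log"))` where `am.log` is whatever file you saved the AM log to.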