Created on 01-28-2017 06:03 AM - edited 09-16-2022 03:58 AM
Map jobs are failing with the error below:
Timed out after 600 secs Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
Created on 10-20-2017 01:39 PM - edited 10-20-2017 01:40 PM
I am facing a similar issue, and after looking at the job history server I see that the last mapper has failed.
The logs from that map task show just this:
2017-10-13 12:15:24,376 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e25_1505390873369_5614_01_000002
2017-10-13 12:15:24,376 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:27 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:15:24,376 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1505390873369_5614_m_000000_0: Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
2017-10-13 12:18:17,290 INFO [IPC Server handler 4 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1505390873369_5614_m_000010_0 is : 1.0
2017-10-13 12:18:17,385 INFO [IPC Server handler 3 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1505390873369_5614_m_000010_0 is : 1.0
2017-10-13 12:18:17,388 INFO [IPC Server handler 11 on 55492] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1505390873369_5614_m_000010_0
2017-10-13 12:18:17,388 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from RUNNING to SUCCESS_FINISHING_CONTAINER
2017-10-13 12:18:17,388 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1505390873369_5614_m_000010_0
2017-10-13 12:18:17,389 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1505390873369_5614_m_000010 Task Transitioned from RUNNING to SUCCEEDED
2017-10-13 12:18:17,389 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 28
2017-10-13 12:18:17,667 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:28 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:19:23,003 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: Expired:attempt_1505390873369_5614_m_000010_0 Timed out after 60 secs
2017-10-13 12:19:23,003 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Task attempt attempt_1505390873369_5614_m_000010_0 is done from TaskUmbilicalProtocol's point of view. However, it stays in finishing state for too long
2017-10-13 12:19:23,003 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCESS_CONTAINER_CLEANUP
2017-10-13 12:19:23,003 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_e25_1505390873369_5614_01_000012 taskAttempt attempt_1505390873369_5614_m_000010_0
2017-10-13 12:19:23,003 INFO [ContainerLauncher #9] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1505390873369_5614_m_000010_0
2017-10-13 12:19:23,010 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1505390873369_5614_m_000010_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2017-10-13 12:19:23,775 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received completed container container_e25_1505390873369_5614_01_000012
2017-10-13 12:19:23,775 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:28 CompletedReds:0 ContAlloc:33 ContRel:4 HostLocal:0 RackLocal:0
2017-10-13 12:19:23,775 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1505390873369_5614_m_000010_0: Container killed by the ApplicationMaster. Container killed on request. Exit code is 143 Container exited with a non-zero exit code 143
Created 06-06-2019 11:28 PM
yarn logs -applicationId <application ID> should help. This typically occurs due to improper container memory allocation and limited physical memory availability on the cluster.
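For reference, here is a minimal sketch (not from this thread) of one way the map and reduce container memory settings can be raised from a Java driver. The property names are standard MRv2 settings; the class name and the values are purely illustrative and need to be sized for your own cluster.

// Illustrative driver skeleton; values are examples only.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryTuningDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.map.memory.mb", "4096");       // container size for map tasks
        conf.set("mapreduce.map.java.opts", "-Xmx3276m");  // JVM heap, roughly 80% of the container
        conf.set("mapreduce.reduce.memory.mb", "4096");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");

        Job job = Job.getInstance(conf, "memory-tuned-job");
        // ... set mapper, reducer, input and output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If the driver implements Tool, the same properties can also be passed at submit time with -D flags instead of hard-coding them.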
Created 01-28-2017 02:58 PM
The "Timed out after 600 secs Container killed by the ApplicationMaster" message indicates that the application master did not see any progress in the Task for 10 minutes (default timeout) so the Application Master killed it.
The question is what was the task doing so that no progress was detected.
I'd recommend looking at the application logs for clues about what the task was doing when it was killed.
Use the Resource Manager UI, or a command line like the following, to get the logs:
yarn logs -applicationId <application ID> <options>
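As an illustration only (not part of the original reply): if a task genuinely needs more than 10 minutes between output records, the mapper can report liveness explicitly so the ApplicationMaster does not expire it. In the sketch below, SlowWorkMapper and doExpensiveChunk are hypothetical placeholders; context.progress() is the standard Hadoop API call that signals the task is still making progress.

// Illustrative mapper; the per-record work is a placeholder.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SlowWorkMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Hypothetical long-running work, split into chunks so progress
        // can be reported between them.
        for (int chunk = 0; chunk < 100; chunk++) {
            doExpensiveChunk(value, chunk); // placeholder for the real work
            context.progress();             // tell the framework the task is still alive
        }
        context.write(new Text(value.toString()), new LongWritable(1L));
    }

    private void doExpensiveChunk(Text value, int chunk) {
        // placeholder; real work goes here
    }
}

If reporting progress is not practical, the timeout itself can be raised via the mapreduce.task.timeout property (in milliseconds), e.g. -Dmapreduce.task.timeout=1200000, though that only gives a genuinely stuck task longer before it is killed.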
Regards,
Ben