container failed with exit code 143

I have a 6-DataNode cluster, and every second all the NodeManagers go down. I posted the log in

https://community.hortonworks.com/questions/202914/node-manager-is-getting-down-after-few-seconds.ht... and now the reducer job is also failing.

2018-07-09 06:25:26,262 WARN  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:<init>(182)) - rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.
2018-07-09 06:25:26,616 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(223)) - Exit code from container container_1531130193317_0195_02_000001 is : 143
2018-07-09 06:25:26,624 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(223)) - Exit code from container container_1531130193317_0197_01_000001 is : 143
2018-07-09 06:25:26,712 WARN  nodemanager.NMAuditLogger (NMAuditLogger.java:logFailure(150)) - USER=dr.who  OPERATION=Container Finished - Failed  TARGET=ContainerImpl  RESULT=FAILURE  DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE  APPID=application_1531130193317_0195  CONTAINERID=container_1531130193317_0195_02_000001
2018-07-09 06:25:26,819 WARN  nodemanager.NMAuditLogger (NMAuditLogger.java:logFailure(150)) - USER=dr.who  OPERATION=Container Finished - Failed  TARGET=ContainerImpl  RESULT=FAILURE  DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE  APPID=application_1531130193317_0197  CONTAINERID=container_1531130193317_0197_01_000001
2018-07-09 06:25:30,271 WARN  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:<init>(182)) - rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.
2018-07-09 06:25:30,534 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(223)) - Exit code from container container_1531130193317_0198_02_000001 is : 143
2018-07-09 06:25:30,600 WARN  nodemanager.NMAuditLogger (NMAuditLogger.java:logFailure(150)) - USER=dr.who  OPERATION=Container Finished - Failed  TARGET=ContainerImpl  RESULT=FAILURE  DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE  APPID=application_1531130193317_0198  CONTAINERID=container_1531130193317_0198_02_000001
2018-07-09 06:25:31,258 WARN  logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:<init>(182)) - rollingMonitorInterval is set as -1. The log rolling mornitoring interval is disabled. The logs will be aggregated after this application is finished.
2018-07-09 06:25:31,422 WARN  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(223)) - Exit code from container container_1531130193317_0200_02_000001 is : 143
2018-07-09 06:25:31,486 WARN  nodemanager.NMAuditLogger (NMAuditLogger.java:logFailure(150)) - USER=dr.who  OPERATION=Container Finished - Failed  TARGET=ContainerImpl  RESULT=FAILURE  DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE  APPID=application_1531130193317_0200  CONTAINERID=container_1531130193317_0200_02_000001
1 REPLY

Hi @Punit kumar!

AFAIK, exit code 143 is usually related to memory/GC issues.
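
For context: by shell convention, an exit code above 128 means the process was ended by a signal, and 143 is 128 + 15, i.e. SIGTERM. So the container did not crash on its own; something asked it to stop, and on YARN that is often the NodeManager killing a container that went over its memory limit. A minimal sketch to decode such codes (the helper name `describe_exit_code` is just for illustration, not part of any Hadoop tool):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-style exit code; codes above 128 map to 128 + signal number."""
    if code > 128:
        try:
            # signal.Signals looks up the signal enum by its number
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            return f"exit code {code} (no matching signal)"
    return f"exit code {code}"


print(describe_exit_code(143))  # prints "killed by SIGTERM"
```

This only tells you how the container ended, not who sent the signal, which is why the DEBUG logs and memory settings are still needed.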
Could you enable DEBUG mode for the YARN logs?
Also, please share what kind of job you are running and your app (AM), map, and reduce memory properties (the opts as well), plus the NodeManager resources.
Thanks.
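
To make clear which properties are meant and why the opts matter, here is a rough sketch that compares each JVM heap setting (`-Xmx`) against its container size. The property names are the standard MapReduce/YARN ones; the values in `example_conf` are made up for illustration and are not taken from this cluster, and the helper names `xmx_mb` and `check_memory` are hypothetical. The check assumes a heap at or above the container size is a problem, since the JVM also needs non-heap memory.

```python
import re


def xmx_mb(java_opts: str):
    """Return the -Xmx heap size in MB, or None if no -Xmx flag is present."""
    m = re.search(r"-Xmx(\d+)([kKmMgG])?", java_opts)
    if not m:
        return None
    value, unit = int(m.group(1)), (m.group(2) or "").lower()
    # No unit suffix means bytes
    factor = {"": 1 / (1024 * 1024), "k": 1 / 1024, "m": 1, "g": 1024}[unit]
    return value * factor


def check_memory(conf: dict) -> list:
    """Flag heap sizes that leave no headroom and containers larger than the node."""
    pairs = [
        ("mapreduce.map.memory.mb", "mapreduce.map.java.opts"),
        ("mapreduce.reduce.memory.mb", "mapreduce.reduce.java.opts"),
        ("yarn.app.mapreduce.am.resource.mb", "yarn.app.mapreduce.am.command-opts"),
    ]
    node_mb = int(conf["yarn.nodemanager.resource.memory-mb"])
    warnings = []
    for mem_key, opts_key in pairs:
        container_mb = int(conf[mem_key])
        heap_mb = xmx_mb(conf.get(opts_key, ""))
        if heap_mb is not None and heap_mb >= container_mb:
            warnings.append(
                f"{opts_key}: heap {heap_mb:.0f} MB >= container {container_mb} MB"
            )
        if container_mb > node_mb:
            warnings.append(f"{mem_key}: {container_mb} MB > node {node_mb} MB")
    return warnings


# Hypothetical values, for illustration only
example_conf = {
    "yarn.nodemanager.resource.memory-mb": "8192",
    "mapreduce.map.memory.mb": "1024",
    "mapreduce.map.java.opts": "-Xmx1024m",  # heap equals container: flagged
    "mapreduce.reduce.memory.mb": "2048",
    "mapreduce.reduce.java.opts": "-Xmx1638m",
    "yarn.app.mapreduce.am.resource.mb": "1024",
    "yarn.app.mapreduce.am.command-opts": "-Xmx819m",
}

for warning in check_memory(example_conf):
    print(warning)
```

If you paste the real values from mapred-site.xml and yarn-site.xml into the dictionary, any warning printed is a likely candidate for the container kills.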