Created on 06-03-2017 08:11 AM - edited 09-16-2022 04:41 AM
Hello Team,
My Cloudera cluster information:
NameNode with HA enabled
ResourceManager with HA enabled
MapReduce 1 (MRv1) was installed earlier and has since been removed.
3 NodeManagers and 3 DataNodes
Hive service is installed
Cluster is managed from Cloudera Manager
Version 5.11
I am not able to run a MapReduce job from Hive.
hive> INSERT INTO TABLE students VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
Query ID = labuser_20170603150101_eaca6901-5d5f-4c40-8751-2576f7349396
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1496499143480_0003, Tracking URL = http://ip-10-0-21-98.ec2.internal:8088/proxy/application_1496499143480_0003/
Kill Command = /opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/hadoop/bin/hadoop job -kill job_1496499143480_0003
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-06-03 15:01:52,051 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1496499143480_0003 with errors
Error during job, obtaining debugging information...
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched: Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
hive>
Basic configuration information about the cluster:
Maximum application attempts and MapReduce attempts are set to 2.
NodeManager memory allotted: 6 GB per node
NodeManager cores allotted: 2 vcores
Cluster capacity: 6 vcores and 18 GB RAM
Container size (min and max): 1 vcore / 512 MB and 2 vcores / 2 GB
Mapper memory (min and max): 512 MB
Reducer memory (min and max): 1 GB
Heap size for individual daemons: 1 GB
Incremental memory and vcores for containers as well as MapReduce tasks: 512 MB and 1 vcore
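For reference, here is a sketch of the standard YARN/MapReduce property names I believe these Cloudera Manager settings map to; the mapping is my assumption, the values are taken from the list above:

yarn.nodemanager.resource.memory-mb      = 6144   # 6 GB per NodeManager
yarn.nodemanager.resource.cpu-vcores     = 2
yarn.scheduler.minimum-allocation-mb     = 512
yarn.scheduler.maximum-allocation-mb     = 2048
yarn.scheduler.minimum-allocation-vcores = 1
yarn.scheduler.maximum-allocation-vcores = 2
mapreduce.map.memory.mb                  = 512
mapreduce.reduce.memory.mb               = 1024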
ResourceManager logs:
2017-06-03 15:01:39,597 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 3 2017-06-03 15:01:40,719 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Application with id 3 submitted by user labuser 2017-06-03 15:01:40,719 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Storing application with id application_1496499143480_0003 2017-06-03 15:01:40,720 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1496499143480_0003 State change from NEW to NEW_SAVING on event = START 2017-06-03 15:01:40,720 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1496499143480_0003 2017-06-03 15:01:40,719 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser IP=10.0.20.51 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1496499143480_0003 2017-06-03 15:01:40,729 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1496499143480_0003 State change from NEW_SAVING to SUBMITTED on event = APP_NEW_SAVED 2017-06-03 15:01:40,730 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Accepted application application_1496499143480_0003 from user: labuser, in queue: root.users.labuser, currently num of applications: 1 2017-06-03 15:01:40,730 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1496499143480_0003 State change from SUBMITTED to ACCEPTED on event = APP_ACCEPTED 2017-06-03 15:01:40,730 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1496499143480_0003_000001 2017-06-03 15:01:40,731 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from NEW to SUBMITTED on event = START 2017-06-03 15:01:40,731 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1496499143480_0003_000001 to scheduler from user: labuser 2017-06-03 15:01:40,731 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from SUBMITTED to SCHEDULED on event = ATTEMPT_ADDED 2017-06-03 15:01:41,518 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_01_000001 Container Transitioned from NEW to ALLOCATED 2017-06-03 15:01:41,518 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:41,518 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e02_1496499143480_0003_01_000001 of capacity <memory:1024, vCores:1> on host ip-10-0-21-245.ec2.internal:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:5120, vCores:1> available after allocation 2017-06-03 15:01:41,519 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : ip-10-0-21-245.ec2.internal:8041 for container : container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:41,519 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_01_000001 Container 
Transitioned from ALLOCATED to ACQUIRED 2017-06-03 15:01:41,519 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1496499143480_0003_000001 2017-06-03 15:01:41,519 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1496499143480_0003 AttemptId: appattempt_1496499143480_0003_000001 MasterContainer: Container: [ContainerId: container_e02_1496499143480_0003_01_000001, NodeId: ip-10-0-21-245.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-245.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.245:8041 }, ] 2017-06-03 15:01:41,520 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from SCHEDULED to ALLOCATED_SAVING on event = CONTAINER_ALLOCATED 2017-06-03 15:01:41,523 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from ALLOCATED_SAVING to ALLOCATED on event = ATTEMPT_NEW_SAVED 2017-06-03 15:01:41,524 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1496499143480_0003_000001 2017-06-03 15:01:41,526 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_e02_1496499143480_0003_01_000001, NodeId: ip-10-0-21-245.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-245.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.245:8041 }, ] for AM appattempt_1496499143480_0003_000001 2017-06-03 15:01:41,526 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1496499143480_0003_000001 2017-06-03 15:01:41,526 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1496499143480_0003_000001 2017-06-03 15:01:41,535 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_e02_1496499143480_0003_01_000001, NodeId: ip-10-0-21-245.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-245.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.245:8041 }, ] for AM appattempt_1496499143480_0003_000001 2017-06-03 15:01:41,535 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from ALLOCATED to LAUNCHED on event = LAUNCHED 2017-06-03 15:01:42,520 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_01_000001 Container Transitioned from ACQUIRED to RUNNING 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_01_000001 Container Transitioned from RUNNING to COMPLETED 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_e02_1496499143480_0003_01_000001 in state: COMPLETED event:FINISHED 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1496499143480_0003 
CONTAINERID=container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_e02_1496499143480_0003_01_000001 of capacity <memory:1024, vCores:1> on host ip-10-0-21-245.ec2.internal:8041, which currently has 0 containers, <memory:0, vCores:0> used and <memory:6144, vCores:2> available, release resources=true 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1496499143480_0003_000001 with final state: FAILED, and exit status: 1 2017-06-03 15:01:45,872 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1496499143480_0003_000001 released container container_e02_1496499143480_0003_01_000001 on node: host: ip-10-0-21-245.ec2.internal:8041 #containers=0 available=6144 used=0 with event: FINISHED 2017-06-03 15:01:45,873 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from LAUNCHED to FINAL_SAVING on event = CONTAINER_FINISHED 2017-06-03 15:01:45,878 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1496499143480_0003_000001 2017-06-03 15:01:45,878 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1496499143480_0003_000001 2017-06-03 15:01:45,878 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000001 State change from FINAL_SAVING to FAILED on event = ATTEMPT_UPDATE_SAVED 2017-06-03 15:01:45,878 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2 2017-06-03 15:01:45,879 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1496499143480_0003_000001 is done. 
finalState=FAILED 2017-06-03 15:01:45,879 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1496499143480_0003_000002 2017-06-03 15:01:45,879 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1496499143480_0003 requests cleared 2017-06-03 15:01:45,879 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from NEW to SUBMITTED on event = START 2017-06-03 15:01:45,879 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Added Application Attempt appattempt_1496499143480_0003_000002 to scheduler from user: labuser 2017-06-03 15:01:45,880 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from SUBMITTED to SCHEDULED on event = ATTEMPT_ADDED 2017-06-03 15:01:46,764 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_02_000001 Container Transitioned from NEW to ALLOCATED 2017-06-03 15:01:46,764 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser OPERATION=AM Allocated Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:46,764 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_e02_1496499143480_0003_02_000001 of capacity <memory:1024, vCores:1> on host ip-10-0-21-96.ec2.internal:8041, which has 1 containers, <memory:1024, vCores:1> used and <memory:5120, vCores:1> available after allocation 2017-06-03 15:01:46,765 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Sending NMToken for nodeId : ip-10-0-21-96.ec2.internal:8041 for container : container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:46,765 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_02_000001 Container Transitioned from ALLOCATED to ACQUIRED 2017-06-03 15:01:46,765 INFO org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM: Clear node set for appattempt_1496499143480_0003_000002 2017-06-03 15:01:46,765 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Storing attempt: AppId: application_1496499143480_0003 AttemptId: appattempt_1496499143480_0003_000002 MasterContainer: Container: [ContainerId: container_e02_1496499143480_0003_02_000001, NodeId: ip-10-0-21-96.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-96.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.96:8041 }, ] 2017-06-03 15:01:46,766 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from SCHEDULED to ALLOCATED_SAVING on event = CONTAINER_ALLOCATED 2017-06-03 15:01:46,770 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from ALLOCATED_SAVING to ALLOCATED on event = ATTEMPT_NEW_SAVED 2017-06-03 15:01:46,771 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1496499143480_0003_000002 2017-06-03 15:01:46,773 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up 
container Container: [ContainerId: container_e02_1496499143480_0003_02_000001, NodeId: ip-10-0-21-96.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-96.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.96:8041 }, ] for AM appattempt_1496499143480_0003_000002 2017-06-03 15:01:46,773 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Create AMRMToken for ApplicationAttempt: appattempt_1496499143480_0003_000002 2017-06-03 15:01:46,773 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Creating password for appattempt_1496499143480_0003_000002 2017-06-03 15:01:46,782 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_e02_1496499143480_0003_02_000001, NodeId: ip-10-0-21-96.ec2.internal:8041, NodeHttpAddress: ip-10-0-21-96.ec2.internal:8042, Resource: <memory:1024, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 10.0.21.96:8041 }, ] for AM appattempt_1496499143480_0003_000002 2017-06-03 15:01:46,782 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from ALLOCATED to LAUNCHED on event = LAUNCHED 2017-06-03 15:01:47,768 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_02_000001 Container Transitioned from ACQUIRED to RUNNING 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_e02_1496499143480_0003_02_000001 Container Transitioned from RUNNING to COMPLETED 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: Completed container: container_e02_1496499143480_0003_02_000001 in state: COMPLETED event:FINISHED 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_e02_1496499143480_0003_02_000001 of capacity <memory:1024, vCores:1> on host ip-10-0-21-96.ec2.internal:8041, which currently has 0 containers, <memory:0, vCores:0> used and <memory:6144, vCores:2> available, release resources=true 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1496499143480_0003_000002 with final state: FAILED, and exit status: 1 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application attempt appattempt_1496499143480_0003_000002 released container container_e02_1496499143480_0003_02_000001 on node: host: ip-10-0-21-96.ec2.internal:8041 #containers=0 available=6144 used=0 with event: FINISHED 2017-06-03 15:01:51,243 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from LAUNCHED to FINAL_SAVING on event = CONTAINER_FINISHED 2017-06-03 15:01:51,249 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1496499143480_0003_000002 2017-06-03 15:01:51,249 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1496499143480_0003_000002 2017-06-03 15:01:51,249 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1496499143480_0003_000002 State change from FINAL_SAVING to FAILED on event = ATTEMPT_UPDATE_SAVED 2017-06-03 15:01:51,249 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2 2017-06-03 15:01:51,249 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1496499143480_0003 with final state: FAILED 2017-06-03 15:01:51,249 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1496499143480_0003 2017-06-03 15:01:51,250 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1496499143480_0003 State change from ACCEPTED to FINAL_SAVING on event = ATTEMPT_FAILED 2017-06-03 15:01:51,250 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Application appattempt_1496499143480_0003_000002 is done. finalState=FAILED 2017-06-03 15:01:51,250 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1496499143480_0003 requests cleared 2017-06-03 15:01:51,293 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1496499143480_0003 failed 2 times due to AM Container for appattempt_1496499143480_0003_000002 exited with exitCode: 1 For more detailed output, check application tracking page:http://ip-10-0-21-98.ec2.internal:8088/proxy/application_1496499143480_0003/Then, click on links to logs of each attempt. Diagnostics: Exception from container-launch. Container id: container_e02_1496499143480_0003_02_000001 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) at org.apache.hadoop.util.Shell.run(Shell.java:504) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 1 Failing this attempt. Failing the application. 
2017-06-03 15:01:51,293 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1496499143480_0003 State change from FINAL_SAVING to FAILED on event = APP_UPDATE_SAVED 2017-06-03 15:01:51,293 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1496499143480_0003 failed 2 times due to AM Container for appattempt_1496499143480_0003_000002 exited with exitCode: 1 For more detailed output, check application tracking page:http://ip-10-0-21-98.ec2.internal:8088/proxy/application_1496499143480_0003/Then, click on links to logs of each attempt. Diagnostics: Exception from container-launch. Container id: container_e02_1496499143480_0003_02_000001 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) at org.apache.hadoop.util.Shell.run(Shell.java:504) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 1 Failing this attempt. Failing the application. APPID=application_1496499143480_0003 2017-06-03 15:01:51,294 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1496499143480_0003,name=INSERT INTO TABLE students VALUES ('...2.32)(Stage-1),user=labuser,queue=root.users.labuser,state=FAILED,trackingUrl=http://ip-10-0-21-98.ec2.internal:8088/cluster/app/application_1496499143480_0003,appMasterHost=N/A,startTime=1496502100719,finishTime=1496502111249,finalStatus=FAILED 2017-06-03 15:01:52,077 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=labuser IP=10.0.20.51 OPERATION=Kill Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1496499143480_0003
NodeManager logs:
Node 1:
2017-06-03 15:01:41,531 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_e02_1496499143480_0003_01_000001 by user labuser 2017-06-03 15:01:41,531 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1496499143480_0003 2017-06-03 15:01:41,531 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from NEW to INITING 2017-06-03 15:01:41,532 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=labuser IP=10.0.21.98 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:41,541 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: rollingMonitorInterval is set as -1. The log rolling monitoring interval is disabled. The logs will be aggregated after this application is finished. 2017-06-03 15:01:41,567 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_e02_1496499143480_0003_01_000001 to application application_1496499143480_0003 2017-06-03 15:01:41,568 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from INITING to RUNNING 2017-06-03 15:01:41,568 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_01_000001 transitioned from NEW to LOCALIZING 2017-06-03 15:01:41,569 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1496499143480_0003 2017-06-03 15:01:41,569 INFO org.apache.spark.network.yarn.YarnShuffleService: Initializing container container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:41,570 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:41,571 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /yarn/nm/nmPrivate/container_e02_1496499143480_0003_01_000001.tokens. 
Credentials list: 2017-06-03 15:01:41,571 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user labuser 2017-06-03 15:01:41,573 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /yarn/nm/nmPrivate/container_e02_1496499143480_0003_01_000001.tokens to /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001.tokens 2017-06-03 15:01:41,573 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 = file:/yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 2017-06-03 15:01:42,111 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_01_000001 transitioned from LOCALIZING to LOCALIZED 2017-06-03 15:01:42,132 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_01_000001 transitioned from LOCALIZED to RUNNING 2017-06-03 15:01:42,136 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001/default_container_executor.sh] 2017-06-03 15:01:44,297 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:44,354 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 13223 for container-id container_e02_1496499143480_0003_01_000001: 114.7 MB of 1 GB physical memory used; 1.4 GB of 2.1 GB virtual memory used 2017-06-03 15:01:45,825 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_e02_1496499143480_0003_01_000001 is : 1 2017-06-03 15:01:45,825 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_e02_1496499143480_0003_01_000001 and exit code: 1 ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) at org.apache.hadoop.util.Shell.run(Shell.java:504) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 1 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=1: 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:504) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2017-06-03 15:01:45,827 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2017-06-03 15:01:45,828 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1 2017-06-03 15:01:45,828 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_01_000001 transitioned from RUNNING to EXITED_WITH_FAILURE 2017-06-03 15:01:45,828 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:45,860 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=labuser OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:45,860 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_01_000001 transitioned from EXITED_WITH_FAILURE to DONE 2017-06-03 15:01:45,860 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_e02_1496499143480_0003_01_000001 from application application_1496499143480_0003 2017-06-03 15:01:45,860 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container 
container_e02_1496499143480_0003_01_000001 for log-aggregation 2017-06-03 15:01:45,860 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1496499143480_0003 2017-06-03 15:01:45,860 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping container container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:45,860 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:46,871 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_e02_1496499143480_0003_01_000001] 2017-06-03 15:01:47,355 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_e02_1496499143480_0003_01_000001 2017-06-03 15:01:51,879 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP 2017-06-03 15:01:51,880 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 2017-06-03 15:01:51,880 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1496499143480_0003 2017-06-03 15:01:51,880 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping application application_1496499143480_0003 2017-06-03 15:01:51,880 INFO org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: Application application_1496499143480_0003 removed, cleanupLocalDirs = false 2017-06-03 15:01:51,881 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED 2017-06-03 15:01:51,881 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1496499143480_0003 2017-06-03 15:01:51,897 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_e02_1496499143480_0003_01_000001. Current good log dirs are /yarn/container-logs 2017-06-03 15:01:51,897 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001/stdout 2017-06-03 15:01:51,898 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001/stderr 2017-06-03 15:01:51,898 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_01_000001/syslog 2017-06-03 15:01:51,932 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003
Node 2:
2017-06-03 15:01:46,776 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_e02_1496499143480_0003_02_000001 by user labuser 2017-06-03 15:01:46,776 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Creating a new application reference for app application_1496499143480_0003 2017-06-03 15:01:46,777 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from NEW to INITING 2017-06-03 15:01:46,777 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=labuser IP=10.0.21.98 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:46,780 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: rollingMonitorInterval is set as -1. The log rolling monitoring interval is disabled. The logs will be aggregated after this application is finished. 2017-06-03 15:01:46,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Adding container_e02_1496499143480_0003_02_000001 to application application_1496499143480_0003 2017-06-03 15:01:46,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from INITING to RUNNING 2017-06-03 15:01:46,793 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_02_000001 transitioned from NEW to LOCALIZING 2017-06-03 15:01:46,794 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_INIT for appId application_1496499143480_0003 2017-06-03 15:01:46,794 INFO org.apache.spark.network.yarn.YarnShuffleService: Initializing container container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:46,795 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Created localizer for container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:46,797 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Writing credentials to the nmPrivate file /yarn/nm/nmPrivate/container_e02_1496499143480_0003_02_000001.tokens. 
Credentials list: 2017-06-03 15:01:46,798 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user labuser 2017-06-03 15:01:46,799 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /yarn/nm/nmPrivate/container_e02_1496499143480_0003_02_000001.tokens to /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001.tokens 2017-06-03 15:01:46,799 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 = file:/yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 2017-06-03 15:01:47,353 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_02_000001 transitioned from LOCALIZING to LOCALIZED 2017-06-03 15:01:47,374 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_02_000001 transitioned from LOCALIZED to RUNNING 2017-06-03 15:01:47,378 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: launchContainer: [bash, /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001/default_container_executor.sh] 2017-06-03 15:01:47,425 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Starting resource-monitoring for container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:47,455 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 21272 for container-id container_e02_1496499143480_0003_02_000001: 18.9 MB of 1 GB physical memory used; 1.4 GB of 2.1 GB virtual memory used 2017-06-03 15:01:50,491 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Memory usage of ProcessTree 21272 for container-id container_e02_1496499143480_0003_02_000001: 143.6 MB of 1 GB physical memory used; 1.4 GB of 2.1 GB virtual memory used 2017-06-03 15:01:51,213 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_e02_1496499143480_0003_02_000001 is : 1 2017-06-03 15:01:51,213 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_e02_1496499143480_0003_02_000001 and exit code: 1 ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) at org.apache.hadoop.util.Shell.run(Shell.java:504) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 1 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=1: 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:601) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:504) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2017-06-03 15:01:51,214 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2017-06-03 15:01:51,215 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 1 2017-06-03 15:01:51,215 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_02_000001 transitioned from RUNNING to EXITED_WITH_FAILURE 2017-06-03 15:01:51,215 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:51,235 WARN org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=labuser OPERATION=Container Finished - Failed TARGET=ContainerImpl RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE APPID=application_1496499143480_0003 CONTAINERID=container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:51,236 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_e02_1496499143480_0003_02_000001 transitioned from EXITED_WITH_FAILURE to DONE 2017-06-03 15:01:51,236 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Removing container_e02_1496499143480_0003_02_000001 from application application_1496499143480_0003 2017-06-03 15:01:51,236 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Considering container 
container_e02_1496499143480_0003_02_000001 for log-aggregation 2017-06-03 15:01:51,236 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event CONTAINER_STOP for appId application_1496499143480_0003 2017-06-03 15:01:51,236 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:51,236 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping container container_e02_1496499143480_0003_02_000001 2017-06-03 15:01:52,241 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed completed containers from NM context: [container_e02_1496499143480_0003_02_000001] 2017-06-03 15:01:52,241 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from RUNNING to APPLICATION_RESOURCES_CLEANINGUP 2017-06-03 15:01:52,242 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting absolute path : /yarn/nm/usercache/labuser/appcache/application_1496499143480_0003 2017-06-03 15:01:52,242 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Got event APPLICATION_STOP for appId application_1496499143480_0003 2017-06-03 15:01:52,242 INFO org.apache.spark.network.yarn.YarnShuffleService: Stopping application application_1496499143480_0003 2017-06-03 15:01:52,242 INFO org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: Application application_1496499143480_0003 removed, cleanupLocalDirs = false 2017-06-03 15:01:52,242 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.Application: Application application_1496499143480_0003 transitioned from APPLICATION_RESOURCES_CLEANINGUP to FINISHED 2017-06-03 15:01:52,242 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Application just finished : application_1496499143480_0003 2017-06-03 15:01:52,340 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl: Uploading logs for container container_e02_1496499143480_0003_02_000001. Current good log dirs are /yarn/container-logs 2017-06-03 15:01:52,341 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001/stderr 2017-06-03 15:01:52,342 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001/syslog 2017-06-03 15:01:52,343 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003/container_e02_1496499143480_0003_02_000001/stdout 2017-06-03 15:01:52,372 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Deleting path : /yarn/container-logs/application_1496499143480_0003 2017-06-03 15:01:53,491 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Stopping resource-monitoring for container_e02_1496499143480_0003_02_000001
Created 06-06-2017 05:38 AM
The logs of the YARN services (RM, NM) are not what you need here. You must inspect the logs of the YARN job itself, via the JobHistory Server.
Note that job_1496499143480_0003 uses the legacy (pre-YARN) naming convention; the actual YARN application ID is application_1496499143480_0003.
Option 1: open the YARN UI and inspect the dashboard of recent jobs for that ID.
Option 2: use the CLI, e.g.:
yarn application -status application_1496499143480_0003
yarn logs -applicationId application_1496499143480_0003 | more
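If you need to find the application ID first, listing failed applications should also work (standard YARN CLI, offered as an extra suggestion):

yarn application -list -appStates FAILED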
Created 06-06-2017 05:45 AM
Hello,
When I inspected the ApplicationMaster attempts and their container logs, this is the log message I found:
ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
    at org.apache.hadoop.util.Shell.run(Shell.java:504)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
The reason I displayed the RM and NM logs was to show that a container was allocated for the job and that the application was retried on it until the retry limit was reached.
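To get at the actual reason for exit code 1, the stderr of the AM container itself is what matters. A sketch of how that could be pulled, assuming log aggregation is enabled (the NM logs above do show logs being uploaded):

yarn logs -applicationId application_1496499143480_0003 \
    -containerId container_e02_1496499143480_0003_01_000001 \
    -nodeAddress ip-10-0-21-245.ec2.internal:8041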
Created 02-15-2018 09:41 AM
I have fixed the issue. The problem was a failure to load the Hadoop native library. After setting an environment variable that points to the Hadoop native library, I can run MR jobs in my cluster.
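For anyone hitting the same thing, a minimal sketch of the kind of setting involved; I am assuming the variable points at the native library directory shipped in the CDH parcel, so adjust the path to your parcel version:

# e.g. in hadoop-env.sh, or via the Cloudera Manager environment safety valve
export HADOOP_COMMON_LIB_NATIVE_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_COMMON_LIB_NATIVE_DIR"

# verify that the native libraries now load
hadoop checknative -a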
Created 03-12-2018 02:24 AM
I am facing the same problem. Could you please tell me which configuration to edit to point to the native library?
@rkkrishnaa wrote: I have fixed the issue. The problem was a failure to load the Hadoop native library. After setting an environment variable that points to the Hadoop native library, I can run MR jobs in my cluster.