Expert Contributor
Posts: 229
Registered: ‎01-25-2017

Spark error

Hi,
 
Is anyone familiar with this error in a Spark job? (This is from the AM logs.)
 
Container exited with a non-zero exit code 1

2017-06-20 10:37:02,785 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Executor lost: 24 (epoch 3)
2017-06-20 10:37:02,784 [sparkDriver-akka.actor.default-dispatcher-43] INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint - Driver terminated or disconnected! Shutting down. svpr-dhc016.lpdomain.com:55642
2017-06-20 10:37:02,785 [sparkDriver-akka.actor.default-dispatcher-36] ERROR org.apache.spark.scheduler.cluster.YarnClusterScheduler - Lost executor 6 on svpr-dhc035.lpdomain.com: Executor heartbeat timed out after 145717 ms
2017-06-20 10:37:02,793 [sparkDriver-akka.actor.default-dispatcher-43] INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint - Driver terminated or disconnected! Shutting down. svpr-dhc016.lpdomain.com:34794
2017-06-20 10:37:02,795 [task-result-getter-0-SendThread(svpr-azk05.lpdomain.com:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server svpr-azk05.lpdomain.com/172.16.147.150:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-20 10:37:02,795 [Reporter] INFO org.apache.spark.deploy.yarn.YarnAllocator - Completed container container_e29_1497968230437_0016_01_000037 (state: COMPLETE, exit status: 1)
2017-06-20 10:37:02,795 [sparkDriver-akka.actor.default-dispatcher-36] INFO org.apache.spark.scheduler.TaskSetManager - Re-queueing tasks for 6 from TaskSet 18.0
2017-06-20 10:37:02,795 [sparkDriver-akka.actor.default-dispatcher-45] INFO org.apache.spark.storage.BlockManagerMasterEndpoint - Trying to remove executor 24 from BlockManagerMaster.
2017-06-20 10:37:02,807 [Reporter] INFO org.apache.spark.deploy.yarn.YarnAllocator - Container marked as failed: container_e29_1497968230437_0016_01_000037. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_e29_1497968230437_0016_01_000037
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)