spark error

Master Collaborator

Hi,
 
Is anyone familiar with this error in a Spark job? (This is from the AM logs.)
 
Container exited with a non-zero exit code 1

2017-06-20 10:37:02,785 [dag-scheduler-event-loop] INFO org.apache.spark.scheduler.DAGScheduler - Executor lost: 24 (epoch 3)
2017-06-20 10:37:02,784 [sparkDriver-akka.actor.default-dispatcher-43] INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint - Driver terminated or disconnected! Shutting down. svpr-dhc016.lpdomain.com:55642
2017-06-20 10:37:02,785 [sparkDriver-akka.actor.default-dispatcher-36] ERROR org.apache.spark.scheduler.cluster.YarnClusterScheduler - Lost executor 6 on svpr-dhc035.lpdomain.com: Executor heartbeat timed out after 145717 ms
2017-06-20 10:37:02,793 [sparkDriver-akka.actor.default-dispatcher-43] INFO org.apache.spark.deploy.yarn.ApplicationMaster$AMEndpoint - Driver terminated or disconnected! Shutting down. svpr-dhc016.lpdomain.com:34794
2017-06-20 10:37:02,795 [task-result-getter-0-SendThread(svpr-azk05.lpdomain.com:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server svpr-azk05.lpdomain.com/172.16.147.150:2181. Will not attempt to authenticate using SASL (unknown error)
2017-06-20 10:37:02,795 [Reporter] INFO org.apache.spark.deploy.yarn.YarnAllocator - Completed container container_e29_1497968230437_0016_01_000037 (state: COMPLETE, exit status: 1)
2017-06-20 10:37:02,795 [sparkDriver-akka.actor.default-dispatcher-36] INFO org.apache.spark.scheduler.TaskSetManager - Re-queueing tasks for 6 from TaskSet 18.0
2017-06-20 10:37:02,795 [sparkDriver-akka.actor.default-dispatcher-45] INFO org.apache.spark.storage.BlockManagerMasterEndpoint - Trying to remove executor 24 from BlockManagerMaster.
2017-06-20 10:37:02,807 [Reporter] INFO org.apache.spark.deploy.yarn.YarnAllocator - Container marked as failed: container_e29_1497968230437_0016_01_000037. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_e29_1497968230437_0016_01_000037
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)

1 REPLY

Cloudera Employee

Hi,

 

This error can occur when the Spark executor memory (--executor-memory) is too small for Spark to start. Please refer to the upstream JIRA for more details.

 

https://issues.apache.org/jira/browse/SPARK-12759
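
As a rough illustration of the kind of check that JIRA describes, here is a minimal pre-flight sketch in Python. It assumes the unified memory manager in Spark 1.6 and later, which (to my understanding) reserves 300 MB and requires at least 1.5 times that amount; older releases may behave differently. The helper names parse_memory and executor_memory_ok are hypothetical and are not part of Spark.

```python
import re

# Assumption: Spark 1.6+ reserves 300 MB of system memory and requires
# the executor heap to be at least 1.5x that amount (about 450 MB).
RESERVED_BYTES = 300 * 1024 * 1024
MIN_EXECUTOR_BYTES = int(RESERVED_BYTES * 1.5)

# Binary multipliers for the common size suffixes (k, m, g, t).
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3, "t": 1024 ** 4}


def parse_memory(value):
    """Convert a size string such as '512m' or '2g' into bytes.

    Hypothetical helper: only k/m/g/t suffixes (optionally followed
    by 'b') are accepted, and a suffix is required.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([kmgt])b?\s*", value.lower())
    if match is None:
        raise ValueError("unrecognised memory size: %r" % value)
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def executor_memory_ok(value):
    """Return True if the setting meets the assumed minimum."""
    return parse_memory(value) >= MIN_EXECUTOR_BYTES


if __name__ == "__main__":
    for setting in ("256m", "512m", "2g"):
        status = "ok" if executor_memory_ok(setting) else "too small"
        print(setting, status)
```

If the value passed as --executor-memory (or spark.executor.memory) falls below this threshold, raising it is the first thing to try. Note that the log above also reports executor heartbeat timeouts, so a value above the minimum does not by itself rule out memory pressure.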

 

Thanks

AKR