Created on 07-26-2017 09:47 PM - edited 09-16-2022 04:59 AM
Created 07-26-2017 11:30 PM
Error initializing SparkContext. org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
The error doesn't tell you up front what is wrong, but it generally suggests a communication/network problem. On a Spark on YARN installation, Spark assigns random port numbers to a group of configuration properties that are used for communication between the client and the cluster. If any of those randomly assigned ports falls outside the range of ports that are open at the time the application is submitted, the shell fails. I'd make sure there are no firewalls (software such as iptables, or hardware) blocking the ports between the driver and the application master.
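One rough way to rule this in or out (not something from this thread, just a sketch using standard Spark configuration properties and made-up port values) is to pin the ports Spark would otherwise pick at random, then open that known range on the firewall between the client and the cluster:

pyspark2 \
  --conf spark.driver.port=40000 \
  --conf spark.blockManager.port=40010 \
  --conf spark.port.maxRetries=32

If the shell starts cleanly once the pinned ports are reachable, a randomly assigned port landing in a blocked range was the problem.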
Also, please check the YARN application logs (from the Resource Manager Web UI) to see whether any hints or errors are logged.
Example:
Container: container_1487033446961_56149_01_000001 on xxx.com_8041
========================================================================================
LogType:stderr
Log Upload Time:Mon Apr 03 17:57:15 -0400 2017
LogLength:2235
Log Contents:
17/04/03 17:52:55 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
17/04/03 17:52:55 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1487033446961_56149_000001
17/04/03 17:52:56 INFO spark.SecurityManager: Changing view acls to: 25761081
17/04/03 17:52:56 INFO spark.SecurityManager: Changing modify acls to: 25761081
17/04/03 17:52:56 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(25761081); users with modify permissions: Set(25761081)
17/04/03 17:52:56 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
17/04/03 17:55:03 ERROR yarn.ApplicationMaster: Failed to connect to driver at 10.xx.xx.243:38634, retrying ...
17/04/03 17:55:03 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver! <<<<
at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:484)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:345)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:187)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:653)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:68)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:68)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:651)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:674)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
17/04/03 17:55:03 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
17/04/03 17:55:03 INFO util.ShutdownHookManager: Shutdown hook called
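A quick way to confirm connectivity (again, a suggestion rather than something taken from the logs above): while the shell is still waiting, run a port check from one of the NodeManager hosts against the driver address the ApplicationMaster says it cannot reach, substituting the real IP and port from your own log:

nc -vz 10.xx.xx.243 38634

If the connection is refused or times out, a firewall or routing issue between the nodes is the likely cause.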
Created 07-27-2017 09:28 AM
If I understand it correctly, you are now able to get past the earlier messages complaining about "Yarn application has already ended!", and running pyspark2 gives you a shell prompt; however, a simple job that converts a list of strings to upper case results in containers being killed with exit status 1.
$ pyspark2
Using Python version 3.6.1 (default, Jul 27 2017 11:07:01)
SparkSession available as 'spark'.
>>> strings=['old']
>>> s2=sc.parallelize(strings)
>>> s3=s2.map(lambda x:x.upper())
>>> s3.collect()
[Stage 0:> (0 + 0) / 2]17/07/27 14:52:18 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1501131033996_0005_01_000002 on host: Slave3. Exit status: 1.
To review why the application failed, we need to look at the container logs.
Container logs are available from the command line by running the yarn logs command:
# yarn logs -applicationId <application ID> | less
OR
Cloudera Manager > YARN > Web UI > Resource Manager Web UI > application_1501131033996_0005 > Logs (at the bottom of the page) > stderr and stdout
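If you only want the failed container from the warning above, yarn logs can usually be narrowed down with -containerId (on some Hadoop releases you also have to pass -nodeAddress for the node that ran the container):

# yarn logs -applicationId application_1501131033996_0005 -containerId container_1501131033996_0005_01_000002 | less

The reason for exit status 1 should show up in that container's stderr or stdout.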