Created 04-19-2018 09:57 AM
I am running a few Spark jobs scheduled in an Oozie workflow, and one of the jobs is failing with the error below:
[main] INFO org.apache.spark.deploy.yarn.Client -
	 client token: N/A
	 diagnostics: Application application_1523897345683_2170 failed 2 times due to AM Container for appattempt_1523897345683_2170_000004 exited with exitCode: 1
For more detailed output, check the application tracking page: http://<master_ip>:8088/cluster/app/application_1523897345683_2170 Then click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e13_1523897345683_2170_04_000001
Exit code: 1
Exception in thread "main" java.io.FileNotFoundException: File does not exist: hdfs://<master_ip>:8020/user/hdfs/.sparkStaging/application_1523897345683_2170/__spark_conf__.zip
	at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1446)
	at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:177)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:174)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.deploy.yarn.ApplicationMaster.<init>(ApplicationMaster.scala:174)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:767)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:67)
	at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:66)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:766)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Failing this attempt. Failing the application.
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: queue_one
	 start time: 1524129738475
	 final status: FAILED
	 tracking URL: http://<master_ip>:8088/cluster/app/application_1523897345683_2170
	 user: hdfs
<<< Invocation of Spark command completed <<<
Hadoop Job IDs executed by Spark: job_1523897345683_2170
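The YARN diagnostics above only show that the AM container could not find the `__spark_conf__.zip` staging file; the aggregated container logs usually reveal the underlying cause. A minimal sketch of how to dig further, assuming the standard `yarn` and `hdfs` CLIs and the application ID and paths from the error message (the `<master_ip>` placeholder is from the report above, not a real host):

```shell
# Pull the aggregated container logs for the failed application
# (application ID taken from the error message above)
yarn logs -applicationId application_1523897345683_2170

# Check whether the Spark staging directory the AM expects still exists;
# if a previous attempt or a cleanup job removed it, retries will fail
# with exactly this FileNotFoundException
hdfs dfs -ls hdfs://<master_ip>:8020/user/hdfs/.sparkStaging/application_1523897345683_2170
```

If the staging directory is missing while attempts are still retrying, that points at the launcher (or a competing process) deleting `.sparkStaging` between AM attempts rather than at the job logic itself.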
Could you please help with this?
Thank you.
Regards
Sampath
Created 04-19-2018 10:04 AM
This can happen if the Oozie shared library is out of date. Could you try recreating the Oozie sharelib and then test again?
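A rough sketch of recreating and reloading the sharelib; the install paths, service user, and hostnames below are assumptions and will differ by distribution:

```shell
# Rebuild the Oozie sharelib on HDFS from the bundled tarball
# (paths and the 'oozie' service user are assumptions -- adjust
# to your distribution's layout)
sudo -u oozie /usr/lib/oozie/bin/oozie-setup.sh sharelib create \
    -fs hdfs://<master_ip>:8020 \
    -locallib /usr/lib/oozie/oozie-sharelib.tar.gz

# Tell the running Oozie server to pick up the new sharelib
# without a restart (<oozie_host> is a placeholder)
oozie admin -oozie http://<oozie_host>:11000/oozie -sharelibupdate

# Verify the Spark sharelib Oozie now sees
oozie admin -oozie http://<oozie_host>:11000/oozie -shareliblist spark
```

After the update, make sure the workflow's Spark action actually uses the sharelib (e.g. `oozie.use.system.libpath=true` in the job properties) so the refreshed jars are picked up on the next run.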
Created 05-21-2018 05:34 AM
@Sampath Kumar Did you find the root cause and a solution for this problem? I am facing the same issue.
Please post the solution; it will help others.
Thanks,
Fairoz
Created 06-13-2018 03:17 PM
@Sampath Kumar, @SHAIKH FAIROZ AHMED, did you find the answer? I'm facing the same issue.
Created 08-16-2018 01:52 PM
@Sampath Kumar, @SHAIKH FAIROZ AHMED, @Jack Marquez, did you find the solution to this problem? I am facing it too.