
I submit a Spark task in YARN mode, but it fails with: Container exited with a non-zero exit code 13. Error file: prelaunch.err. Last 4096 bytes of prelaunch.err: Last 4096 bytes of stderr: ... java.io.FileNotFoundException: File file:

New Contributor

The same job ran normally in previous months, but this problem suddenly appeared recently and I cannot find the cause.


I hope you can help me find the cause of the problem. Thank you!


The error log is as follows:

[2022-06-16 13:12:10,571] {ssh.py:141} INFO - Warning: Ignoring non-spark config property: 2022-06-16=21:04:35,281 DEBUG UserGroupInformation:1902 main 281 - PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.hdfs.tools.GetConf.run(GetConf.java:315)
[2022-06-16 13:12:10,572] {ssh.py:141} INFO - Warning: Ignoring non-spark config property: hdfs=//bd.vn0038.jmrh.com:8020/user/spark/applicationHistory
[2022-06-16 13:12:12,237] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO client.RMProxy: Connecting to ResourceManager at bd.vn0038.jmrh.com/172.168.100.171:8032
[2022-06-16 13:12:12,557] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Requesting a new application from cluster with 7 NodeManagers
[2022-06-16 13:12:12,698] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO conf.Configuration: resource-types.xml not found
[2022-06-16 13:12:12,699] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
[2022-06-16 13:12:12,716] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (52559 MB per container)
[2022-06-16 13:12:12,717] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
[2022-06-16 13:12:12,717] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Setting up container launch context for our AM
[2022-06-16 13:12:12,720] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Setting up the launch environment for our AM container
[2022-06-16 13:12:12,733] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Preparing resources for our AM container
[2022-06-16 13:12:12,945] {ssh.py:141} INFO - 22/06/16 21:12:12 INFO yarn.Client: Uploading resource file:/opt/project/deltaentropy/com.deltaentropy.bigdata.jar -> hdfs://bd.vn0038.jmrh.com:8020/user/hdfs/.sparkStaging/application_1655384960297_0001/com.deltaentropy.bigdata.jar
[2022-06-16 13:12:13,733] {ssh.py:141} INFO - 22/06/16 21:12:13 INFO yarn.Client: Uploading resource file:/tmp/spark-3d70c7cf-0a07-43f7-9b2e-1c544857e399/__spark_conf__1574342015450416152.zip -> hdfs://bd.vn0038.jmrh.com:8020/user/hdfs/.sparkStaging/application_1655384960297_0001/__spark_conf__.zip
[2022-06-16 13:12:14,110] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO spark.SecurityManager: Changing view acls to: hdfs
[2022-06-16 13:12:14,111] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO spark.SecurityManager: Changing modify acls to: hdfs
[2022-06-16 13:12:14,112] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO spark.SecurityManager: Changing view acls groups to:
[2022-06-16 13:12:14,112] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO spark.SecurityManager: Changing modify acls groups to:
[2022-06-16 13:12:14,113] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs); groups with view permissions: Set(); users with modify permissions: Set(hdfs); groups with modify permissions: Set()
[2022-06-16 13:12:14,144] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.cloudera.hive/hive-site.xml
[2022-06-16 13:12:14,278] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO yarn.Client: Submitting application application_1655384960297_0001 to ResourceManager
[2022-06-16 13:12:14,720] {ssh.py:141} INFO - 22/06/16 21:12:14 INFO impl.YarnClientImpl: Submitted application application_1655384960297_0001
[2022-06-16 13:12:15,726] {ssh.py:141} INFO - 22/06/16 21:12:15 INFO yarn.Client: Application report for application_1655384960297_0001 (state: ACCEPTED)
[2022-06-16 13:12:15,730] {ssh.py:141} INFO - 22/06/16 21:12:15 INFO yarn.Client: client token: N/A
[2022-06-16 13:12:15,731] {ssh.py:141} INFO - diagnostics: AM container is launched, waiting for AM container to Register with RM ApplicationMaster host: N/A ApplicationMaster RPC port: -1
[2022-06-16 13:12:15,731] {ssh.py:141} INFO - queue: root.users.hdfs start time: 1655385134427 final status: UNDEFINED tracking URL: http://bd.vn0038.jmrh.com:8088/proxy/application_1655384960297_0001/ user: hdfs
[2022-06-16 13:12:16,736] {ssh.py:141} INFO - 22/06/16 21:12:16 INFO yarn.Client: Application report for application_1655384960297_0001 (state: ACCEPTED)
[2022-06-16 13:12:17,740] {ssh.py:141} INFO - 22/06/16 21:12:17 INFO yarn.Client: Application report for application_1655384960297_0001 (state: ACCEPTED)
[2022-06-16 13:12:18,745] {ssh.py:141} INFO - 22/06/16 21:12:18 INFO yarn.Client: Application report for application_1655384960297_0001 (state: ACCEPTED)
[2022-06-16 13:12:19,749] {ssh.py:141} INFO - 22/06/16 21:12:19 INFO yarn.Client: Application report for application_1655384960297_0001 (state: ACCEPTED)
[2022-06-16 13:12:20,758] {ssh.py:141} INFO - 22/06/16 21:12:20 INFO yarn.Client: Application report for application_1655384960297_0001 (state: FAILED)
22/06/16 21:12:20 INFO yarn.Client: client token: N/A
diagnostics: Application application_1655384960297_0001 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1655384960297_0001_000001 exited with exitCode: 13 Failing this attempt.
Diagnostics: [2022-06-16 21:12:19.730]Exception from container-launch.
Container id: container_1655384960297_0001_01_000001
Exit code: 13
[2022-06-16 21:12:19.757]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:103)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:533)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2549)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:944)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:935)
    at myspark.warehouse.DriveEvent$.main(DriveEvent.scala:99)
    at myspark.warehouse.DriveEvent.main(DriveEvent.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
)
22/06/16 21:12:18 ERROR yarn.ApplicationMaster: Uncaught exception: org.apache.spark.SparkException: Exception thrown in awaitResult:
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:447)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:275)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:805)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:804)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:804)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.io.FileNotFoundException: File file:/var/hadoop/yarn/nm/usercache/hdfs/appcache/application_1655384960297_0001/container_1655384960297_0001_01_000001/2022-06-16 21:04:35,000 DEBUG Shell:822 main 0 - setsid exited with exit code 0 does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
    at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:103)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:533)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2549)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:944)
    at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:935)
    at myspark.warehouse.DriveEvent$.main(DriveEvent.scala:99)
    at myspark.warehouse.DriveEvent.main(DriveEvent.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
22/06/16 21:12:18 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://bd.vn0038.jmrh.com:8020/user/hdfs/.sparkStaging/application_1655384960297_0001
22/06/16 21:12:19 INFO util.ShutdownHookManager: Shutdown hook called
22/06/16 21:12:19 INFO util.ShutdownHookManager: Deleting directory /var/hadoop/yarn/nm/usercache/hdfs/appcache/application_1655384960297_0001/spark-7ef523d9-b756-4971-9969-bfa9f4b828eb
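
One thing I notice: the two "Ignoring non-spark config property" warnings at the top of the log, and the DEBUG text embedded in the path of the FileNotFoundException, look as if the configuration passed to spark-submit is being built dynamically and has been polluted by debug output. For comparison, a cleanly formed submission would look roughly like the sketch below; the class name, jar path and history directory are taken from the log above, while the remaining options are only illustrative assumptions, not my exact command.

# Illustrative sketch only; the class, jar and history paths come from the log above,
# everything else is an assumption rather than the real submit command
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class myspark.warehouse.DriveEvent \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs://bd.vn0038.jmrh.com:8020/user/spark/applicationHistory \
  /opt/project/deltaentropy/com.deltaentropy.bigdata.jar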

2 REPLIES

New Contributor

I have checked the local-dirs path configured for YARN. The path is normal: data can be written to it and the user permissions are correct.
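
For example, I verified it roughly like this (the directory is the local-dirs root that appears in the error log, and /etc/hadoop/conf is the usual config location on our cluster; both may differ elsewhere):

# Show the configured NodeManager local dirs (config path may differ per distribution)
grep -A1 'yarn.nodemanager.local-dirs' /etc/hadoop/conf/yarn-site.xml
# Test that the yarn user can write to the local-dirs root seen in the error log
sudo -u yarn touch /var/hadoop/yarn/nm/write_test && sudo -u yarn rm /var/hadoop/yarn/nm/write_test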

Master Collaborator

Hi @AZIMKBC 


Please try to run the SparkPi example and see if there is any error in the logs.


https://rangareddy.github.io/SparkPiExample/
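
For reference, in YARN cluster mode that is typically a command along these lines (the examples jar path below is a placeholder and varies by Spark distribution and version):

# Run the built-in SparkPi example on YARN; adjust the examples jar path to your installation
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_*.jar 100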


If the issue is still not resolved and you are a Cloudera customer, please raise a case and we will work on it internally.