
FileNotFoundException - Spark 2

Hi,

 

I'm running a 3-node CDH cluster with 1 master and 2 slaves. I have a web application written in Java that submits Spark jobs to YARN, and I'm now getting the error below. The web app is deployed on Tomcat, which runs as a different OS user.
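For reference, the submission from the web app looks roughly like this (a minimal sketch, not the exact code; the Spark home, jar path, and main class below are placeholders):

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class JobSubmitter {
    public static void main(String[] args) throws Exception {
        SparkLauncher launcher = new SparkLauncher()
                .setSparkHome("/opt/spark2")                 // placeholder Spark 2 home
                .setAppResource("/path/to/my-spark-job.jar") // placeholder application jar
                .setMainClass("com.example.MySparkJob")      // placeholder main class
                .setMaster("yarn")
                .setDeployMode("client");

        // Submit without blocking and poll the state from the web app.
        SparkAppHandle handle = launcher.startApplication();
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000);
        }
    }
}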

 

Application application_1502437323246_0010 failed 2 times due to AM Container for appattempt_1502437323246_0010_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:, click on links to logs of each attempt.
Diagnostics: File file:/home/user/tomcat/apache-tomcat-8.0.38/temp/spark-1692c53f-313a-41c1-9581-e716c244b7c8/__spark_libs__4041232999285325500.zip does not exist
java.io.FileNotFoundException: File file:/home/user/tomcat/apache-tomcat-8.0.38/temp/spark-1692c53f-313a-41c1-9581-e716c244b7c8/__spark_libs__4041232999285325500.zip does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:598)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:811)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:588)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:425)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
 
Failing this attempt. Failing the application.
 
It looks like the worker nodes do not have access to the above file location; these files should ideally be created on HDFS so that the workers can access them.
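If that's right, I imagine the fix looks something like this, building on the launcher sketch above: stage an archive of the Spark jars on HDFS once, then point spark.yarn.archive at it (spark.yarn.archive is a standard Spark-on-YARN property; the HDFS path below is my guess, not an existing file):

// Hypothetical: the archive would be a zip of everything under $SPARK_HOME/jars,
// uploaded to HDFS once beforehand (e.g. with `hdfs dfs -put`), so the submission
// no longer has to zip and upload the local jars on every run.
launcher.setConf("spark.yarn.archive", "hdfs:///user/spark/spark2-libs.zip");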
 
Questions
 
1) What are these files, and why are they being created under Tomcat's temp folder?
2) Is there a configuration that creates these files on HDFS instead, which would resolve the above error?
3) Are there any other considerations when running in "client" deploy mode? (One thing I'm wondering about is sketched right after this list.)
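On question 3, part of why I ask: my understanding is that in client mode the driver runs inside the submitting JVM, which here is Tomcat itself. For reference, this is the switch I mean, again building on the launcher sketch above (I don't know whether it changes where the libs archive gets staged from):

// "cluster" mode would run the driver in a YARN container instead of inside
// Tomcat's JVM; whether that affects the __spark_libs__ staging location is
// exactly what I'm unsure about.
launcher.setDeployMode("cluster");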
 
Any other information would be useful, as I am new to Spark and HDFS. I am using the default configuration of CDH 5.12 along with the Spark 2.1.0 distribution.