Member since
03-22-2019
2
Posts
0
Kudos Received
0
Solutions
03-23-2019
07:50 PM
Can someone help me understand where does the following method look for the file in the yarn cluster mode connecting to EMR? org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus As per the code, the method is throwing java.io.FileNotFoundException: File file:/spark_stage/.sparkStaging/application_*/__spark_libs__*.zip does not exist The file is present in the local file system and accessible by the user launching the Yarn AM. This is happening after the application is submitted to ResourceManager and accepted. Are there any other property that needs to be set other than the stagingDir? I am not sure if its a path issue (file:/spark_stage) or it is trying to look for the file in the HDFS! spark.yarn.stagingDir=file:///spark_stage As per the stack trace, the org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus is called when org.apache.hadoop.yarn.util.FSDownload.call is invoked. at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:640) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:866) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:630) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:452) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
... View more
03-23-2019
01:03 AM
Hi, I would like to understand the behavior of SparkLauncherSparkShellProcess that uses Yarn. Using Kylo (dataLake), when the SparkLauncherSparkShellProcess is launched, why does the RawLocalFileSystem use deprecatedGetFileStatus API? Where does this method look for the file and what permissions? The spark submit is failing with the following exception. I verified the files are present in the staging directory in the local file system of the edge node and the user has read access to the files. However org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus is throwing java.io.FileNotFoundException. The spark master is set to yarn and deployMode is cluster and connects to the EMR cluster from the edge node. Spark Version: 2.3.1 StackTrace: 2019-03-22 22:04:38 DEBUG LauncherServer-3:SparkLauncherSparkShellProcess:270 - Spark Shell client [d6242128-5e08-46d0-9f44-40b787f9d4c4] state changed: CONNECTED 2019-03-22 22:04:38 INFO main:Client:54 - Requesting a new application from cluster with 2 NodeManagers 2019-03-22 22:04:38 INFO main:Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (11712 MB per container) 2019-03-22 22:04:38 INFO main:Client:54 - Will allocate AM container, with 1408 MB memory including 384 MB overhead 2019-03-22 22:04:38 INFO main:Client:54 - Setting up container launch context for our AM 2019-03-22 22:04:38 INFO main:Client:54 - Setting up the launch environment for our AM container 2019-03-22 22:04:38 INFO main:Client:54 - Preparing resources for our AM container 2019-03-22 22:04:38 DEBUG main:HadoopDelegationTokenManager:58 - Service hadoopfs does not require a token. Check your configuration to see if security is disabled or not. 2019-03-22 22:04:38 DEBUG main:HadoopDelegationTokenManager:58 - Service hive does not require a token. Check your configuration to see if security is disabled or not. 2019-03-22 22:04:39 DEBUG main:HadoopDelegationTokenManager:58 - Service hbase does not require a token. Check your configuration to see if security is disabled or not. 2019-03-22 22:04:39 DEBUG main:Client:58 - 2019-03-22 22:04:39 WARN main:Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/tmp/spark-2929fcc8-24df-4e6d-948c-96153dd56ec3/__spark_libs__7437371065422767124.zip -> /spark_stage/.sparkStaging/application_1553289922590_0003/__spark_libs__7437371065422767124.zip 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/usr/lib/spark/jars/kylo-spark-shell-client-v2-0.9.1.3.jar -> /spark_stage/.sparkStaging/application_1553289922590_0003/kylo-spark-shell-client-v2-0.9.1.3.jar 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/usr/lib/spark/jars/hbase-common-1.4.4.jar -> /spark_stage/.sparkStaging/application_1553289922590_0003/hbase-common-1.4.4.jar 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/opt/kylo/kylo-services/conf/log4j.properties -> /spark_stage/.sparkStaging/application_1553289922590_0003/log4j.properties 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/opt/kylo/kylo-services/conf/spark.properties -> /spark_stage/.sparkStaging/application_1553289922590_0003/spark.properties 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/etc/hadoop/conf/hbase-site.xml -> /spark_stage/.sparkStaging/application_1553289922590_0003/hbase-site.xml 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/opt/kylo/ssl/kylo-ui.jks -> /spark_stage/.sparkStaging/application_1553289922590_0003/kylo-ui.jks 2019-03-22 22:04:43 INFO main:Client:54 - Uploading resource file:/tmp/spark-2929fcc8-24df-4e6d-948c-96153dd56ec3/__spark_conf__2249993461795431943.zip -> /spark_stage/.sparkStaging/application_1553289922590_0003/__spark_conf__.zip 2019-03-22 22:04:43 DEBUG main:Client:58 - =============================================================================== 2019-03-22 22:04:43 DEBUG main:Client:58 - YARN AM launch context: 2019-03-22 22:04:43 DEBUG main:Client:58 - user class: com.thinkbiganalytics.spark.SparkShellApp 2019-03-22 22:04:43 DEBUG main:Client:58 - env: 2019-03-22 22:04:43 DEBUG main:Client:58 - CLASSPATH -> /opt/kylo/kylo-services/conf:/opt/kylo/kylo-services/lib/mariadb-java-client-1.5.7.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__ 2019-03-22 22:04:43 DEBUG main:Client:58 - HADOOP_CONF_DIR -> /etc/hadoop 2019-03-22 22:04:43 DEBUG main:Client:58 - SPARK_YARN_STAGING_DIR -> /spark_stage/.sparkStaging/application_1553289922590_0003 2019-03-22 22:04:43 DEBUG main:Client:58 - SPARK_USER -> kylo 2019-03-22 22:04:43 DEBUG main:Client:58 - KYLO_CLIENT_ID -> d6242128-5e08-46d0-9f44-40b787f9d4c2 2019-03-22 22:04:43 DEBUG main:Client:58 - PYTHONHASHSEED -> 0 2019-03-22 22:04:43 DEBUG main:Client:58 - KYLO_CLIENT_SECRET -> *********(redacted) 2019-03-22 22:04:43 DEBUG main:Client:58 - YARN_CONF_DIR -> /etc/hadoop 2019-03-22 22:04:43 DEBUG main:Client:58 - resources: 2019-03-22 22:04:43 DEBUG main:Client:58 - log4j.properties -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/log4j.properties" } size: 2068 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - kylo-ui.jks -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/kylo-ui.jks" } size: 2091 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - __app__.jar -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/kylo-spark-shell-client-v2-0.9.1.3.jar" } size: 28054378 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - spark.properties -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/spark.properties" } size: 4127 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - __spark_libs__ -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/__spark_libs__7437371065422767124.zip" } size: 256569291 timestamp: 1553292283000 type: ARCHIVE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - __spark_conf__ -> resource { port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/__spark_conf__.zip" } size: 102161 timestamp: 1553292283000 type: ARCHIVE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - hbase-common-1.4.4.jar -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/hbase-common-1.4.4.jar" } size: 621182 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - hbase-site.xml -> resource { scheme: "file" port: -1 file: "/spark_stage/.sparkStaging/application_1553289922590_0003/hbase-site.xml" } size: 1529 timestamp: 1553292283000 type: FILE visibility: PRIVATE 2019-03-22 22:04:43 DEBUG main:Client:58 - command: 2019-03-22 22:04:43 DEBUG main:Client:58 - {{JAVA_HOME}}/bin/java -server -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dlog4j.configuration=log4j.properties' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.thinkbiganalytics.spark.SparkShellApp' --jar file:/usr/lib/spark/jars/kylo-spark-shell-client-v2-0.9.1.3.jar --arg '--idle-timeout' --arg '900' --arg '--port-max' --arg '45999' --arg '--port-min' --arg '45000' --arg '--server-url' --arg 'https://host:port/proxy/v1/spark/shell/register' --arg '--server-keystore-path' --arg 'xx.jks' --arg '--server-keystore-password' --arg 'yy' --properties-file {{PWD}}/__spark_conf__/__spark_conf__.properties 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr 2019-03-22 22:04:43 DEBUG main:Client:58 - =============================================================================== 2019-03-22 22:04:43 DEBUG main:Client:58 - spark.yarn.maxAppAttempts is not set. Cluster's default value will be used. 2019-03-22 22:04:43 INFO main:Client:54 - Submitting application application_1553289922590_0003 to ResourceManager 2019-03-22 22:04:43 DEBUG LauncherServer-3:SparkLauncherSparkShellProcess:208 - Spark Shell client [d6242128-5e08-46d0-9f44-40b787f9d4c4] application id: application_1553289922590_0003 2019-03-22 22:04:43 DEBUG LauncherServer-3:SparkLauncherSparkShellProcess:270 - Spark Shell client [d6242128-5e08-46d0-9f44-40b787f9d4c4] state changed: SUBMITTED 2019-03-22 22:04:44 INFO main:Client:54 - Application report for application_1553289922590_0003 (state: ACCEPTED) 2019-03-22 22:04:44 DEBUG main:Client:58 - client token: N/A diagnostics: [Fri Mar 22 22:04:44 +0000 2019] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:23424, vCores:16> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1553292283963 final status: UNDEFINED tracking URL: http://fqdn:20888/proxy/application_1553289922590_0003/ user: kylo 2019-03-22 22:04:45 INFO main:Client:54 - Application report for application_1553289922590_0003 (state: FAILED) 2019-03-22 22:04:45 DEBUG main:Client:58 - client token: N/A diagnostics: Application application_1553289922590_0003 failed 2 times due to AM Container for appattempt_1553289922590_0003_000002 exited with exitCode: -1000 Failing this attempt.Diagnostics: File file:/spark_stage/.sparkStaging/application_1553289922590_0003/__spark_libs__7437371065422767124.zip does not exist java.io.FileNotFoundException: File file:/spark_stage/.sparkStaging/application_1553289922590_0003/__spark_libs__7437371065422767124.zip does not exist at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:640) at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:866) at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:630) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:452) at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253) at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361) at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1840) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359) at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... View more
Labels:
- Labels:
-
Apache Hadoop
-
Apache Spark
-
Apache YARN