Created 12-06-2017 07:52 AM
EDIT: I forgot to say that I am trying to run Spark 2.2 as an independent service that uses HDP 2.6.
Please help, I am running out of time!
I run:
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --queue thequeue \
  examples/jars/spark-examples*.jar 10 \
  --executor-cores 4 \
  --num-executors 11 \
  --driver-java-options="-Dhdp.version=2.6.0.3-8" \
  --conf "spark.executor.extraJavaOptions=-Dhdp.version=2.6.0.3-8"
I get this error:
Spark in YARN cluster mode fails with "Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster".
I have tried all the fixes I can find except:
1. Classpath issues. Where do I set this and to what? (My best guess is in the sketch after this list.)
2. This question suggests it may be due to missing jars. Which jars do I need and what do I do with them?
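For reference, my best guess is that these settings belong in conf/spark-defaults.conf, something like the lines below. The hdp.version value is just the one from my cluster, so treat this as a guess rather than a confirmed fix:

# conf/spark-defaults.conf -- my guess at where the hdp.version settings belong
spark.driver.extraJavaOptions    -Dhdp.version=2.6.0.3-8
spark.yarn.am.extraJavaOptions   -Dhdp.version=2.6.0.3-8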
TIA!
Created 12-06-2017 09:10 AM
The problem is that Spark could not find the necessary jar files. Put the following line in spark-env.sh:
SPARK_DIST_CLASSPATH="$SPARK_HOME/jars/*"
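For completeness, a minimal spark-env.sh for pointing a standalone Spark 2.x build at an existing HDP cluster might look like the sketch below; the paths are assumptions based on a default HDP layout, so adjust them to your install:

# conf/spark-env.sh -- minimal sketch; paths assume a default HDP 2.6 layout
export HADOOP_CONF_DIR=/etc/hadoop/conf     # lets Spark find the cluster's core-site.xml and yarn-site.xml
SPARK_DIST_CLASSPATH="$SPARK_HOME/jars/*"   # the line suggested above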
Created 12-06-2017 10:06 AM
Thanks! Unfortunately it already has that line.
Created 12-06-2017 10:16 AM
Sometimes this happens when the "spark-assembly" jar is not deployed correctly in HDFS (or gets corrupted).
So please try redeploying the "spark-assembly" jar to HDFS and removing the user's ".sparkStaging" directory:
# su - hdfs
# hdfs dfs -rm /hdp/apps/x.x.x.x.x/spark/spark-hdp-assembly.jar
# hdfs dfs -put /usr/hdp/x.x.x.x/spark/lib/spark-assembly-1.y.y.x.x.x.x-hadoopx.x.x.x.x.jar /hdp/apps/x.x.x.x/spark/spark-hdp-assembly.jar
Please replace the x.x.x.x with the HDP / component version from your setup.
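Afterwards you can sanity-check that the jar landed where YARN expects it; the path below uses the same placeholders as above:

# list the deployed assembly jar to verify it is present and readable
# hdfs dfs -ls /hdp/apps/x.x.x.x/spark/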
Also, please delete the ".sparkStaging" HDFS directory of the user who is running the job.
# su - testuser
# hdfs dfs -rm -r -f .sparkStaging
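If you want to confirm the cleanup worked, listing the directory should now fail; it will be recreated on the next spark-submit (the home directory path is an assumption based on the default HDFS layout):

# should report "No such file or directory" after the cleanup
# hdfs dfs -ls /user/testuser/.sparkStaging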
Created 12-06-2017 10:29 AM
Thanks! Sorry, I forgot to say that I am trying to run Spark 2.2 as an independent service that uses HDP 2.6. I assume this won't work for it, since Spark 2.x no longer ships a single spark-assembly jar.
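For anyone who lands here later: my understanding is that the Spark 2.x equivalent is to upload the contents of the jars/ directory to HDFS and point spark.yarn.jars at it, roughly like the sketch below. The HDFS target path is my own choice, not a required location:

# upload the Spark 2.x jars to HDFS (target path is arbitrary)
# su - hdfs
# hdfs dfs -mkdir -p /apps/spark-2.2.0/jars
# hdfs dfs -put /path/to/spark/jars/* /apps/spark-2.2.0/jars/

# conf/spark-defaults.conf -- tell YARN where to find the uploaded jars
spark.yarn.jars hdfs:///apps/spark-2.2.0/jars/*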