
Executing application with spark-class

Explorer

I have a newly installed CDH5 cluster with Spark configured and installed. I have verified that I can log into the Spark interactive shell, but so far I have been unable to submit any Spark application via spark-class. Whenever I do, I get the following exception:

 

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/client/api/impl/YarnClientImpl
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

The instructions I am following are here:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/c...

8 REPLIES

Master Collaborator

Can you share the exact command you are running?

 

The link you supplied goes through a redirector we can't access, and the important part got cut off. Maybe you can clarify what page you are looking at under the installation guide.

Explorer

Here is the command I am running:

 

$SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client --jar ~/samples/syn-spark-project-0.0.1-SNAPSHOT.jar --class spark.WorkCountJob --args yarn-standalone --args input/hello.txt --args output.txt

My cluster is set up with YARN, and syn-spark-project is a jar that I assembled using the jar stored in the $SPARK_HOME directory on my cluster.

 

I am looking at the "Running Spark Applications" entry in the documentation.
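
The job itself is just a word count, along these lines (a simplified sketch; the class name and argument handling here are illustrative, not my exact code):

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object WorkCountJob {
  def main(args: Array[String]) {
    // args(0) = master ("yarn-standalone"), args(1) = input path, args(2) = output path
    val sc = new SparkContext(args(0), "WordCount")
    val counts = sc.textFile(args(1))
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.saveAsTextFile(args(2))
    sc.stop()
  }
}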

Master Collaborator

(I think you have a typo in "WorkCountJob" but that's not the issue yet)

 

Did you run:

 

source /etc/spark/conf/spark-env.sh

Explorer

Yep, I sourced spark-env.sh and set the SPARK_JAR environment variable as the instructions suggested (and I used the same jar when compiling my own jar).
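
For reference, the export was along these lines (the exact assembly jar path is illustrative and will vary by install):

export SPARK_JAR=$SPARK_HOME/assembly/lib/spark-assembly-*.jar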

Master Collaborator

How did you compile your jar file -- against which Spark and Hadoop deps?

 

It seems like something is missing from the classpath. 

Try executing this first and then re-running:

 

export SPARK_PRINT_LAUNCH_COMMAND=1

 

That ought to make it print the full launch command, including the classpath.

The error is from the local driver, not the app on the cluster, right?

Explorer

I am compiling the jar against the Hadoop 0.20.2 dependency and the Spark jar that is loaded on the cluster (the same jar I am pointing the spark-class command to).

 

I do not think this particular error has anything to do with my compiled jar, though, because when I run spark-class while leaving off the --jar argument, it fails with the same error (i.e., it looks like it does not even get to parsing the arguments).

 

I ran it with the debug variable you supplied; here is what I got:

10:32 AM ~/lib/spark-0.9.0-incubating/bin: export SPARK_PRINT_LAUNCH_COMMAND=1
10:34 AM ~/lib/spark-0.9.0-incubating/bin: $SPARK_HOME/bin/spark-class org.apache.spark.deploy.yarn.Client
Spark Command: java -cp :/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/conf:/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/assembly/lib/*:/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/examples/lib/*:/etc/hadoop/conf:/home/tclay/tools/hadoop-0.20.2/*:/home/tclay/tools/hadoop-0.20.2/../hadoop-hdfs/*:/home/tclay/tools/hadoop-0.20.2/../hadoop-yarn/*:/home/tclay/tools/hadoop-0.20.2/../hadoop-mapreduce/*:/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/lib/scala-library.jar:/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/lib/scala-compiler.jar:/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/lib/jline.jar -Djava.library.path=/opt/cloudera/parcels/CDH-5.0.1-1.cdh5.0.1.p0.47/lib/spark/lib:/home/tclay/tools/hadoop-0.20.2/lib/native -Xms512m -Xmx512m org.apache.spark.deploy.yarn.Client

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/client/api/impl/YarnClientImpl
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:482)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

 

Master Collaborator (accepted solution)

OK, I meant what are the Maven / SBT deps, but in any event I think Hadoop 0.20.2 is the problem. CDH5 is Hadoop 2.3, and the supplied Spark works with that. Your classpath on your cluster shows you've also got Hadoop 0.20.2 classes in the mix somehow. I don't know where those are coming from, but that is the problem.
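
For example, a build.sbt along these lines would keep the deps consistent (a sketch; the CDH artifact versions are assumptions, so check what your parcel actually ships):

// Cloudera's repository hosts the CDH-versioned Hadoop artifacts.
resolvers += "cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

libraryDependencies ++= Seq(
  // Match the Spark that ships with CDH5, marked provided so the
  // cluster's copy is used at runtime:
  "org.apache.spark" %% "spark-core" % "0.9.0-incubating" % "provided",
  // The Hadoop 2.3 client from CDH5, not hadoop-core 0.20.2:
  "org.apache.hadoop" % "hadoop-client" % "2.3.0-cdh5.0.1" % "provided"
)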

Explorer

Perfect - my Hadoop home was pointing to the wrong place, and that was what was being picked up. I am able to submit applications just fine now. Thanks.
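
For anyone who hits the same error: the giveaway was the /home/tclay/tools/hadoop-0.20.2 entries in the printed classpath. A quick sanity check before submitting:

echo $HADOOP_HOME    # should point at the CDH5 install, not an old standalone one
hadoop version       # should report 2.3.0-cdh5.x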