
Unable to run Spark on Cloudera Manager Single User Mode with JDK1.8

Contributor

I have installed Cloudera Manager in Single User Mode using Installation Path A, with JDK 1.8. The installation completed fine, but when I try to run "spark-shell", I get the error "java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream" as shown below:

org.apache.spark.launcher.app.Driver                         I Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
org.apache.spark.launcher.app.Driver                         I      at org.apache.spark.deploy.SparkSubmitArguments.handle(SparkSubmitArguments.scala:394)
org.apache.spark.launcher.app.Driver                         I      at org.apache.spark.launcher.SparkSubmitOptionParser.parse(SparkSubmitOptionParser.java:163)
org.apache.spark.launcher.app.Driver                         I      at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:97)
org.apache.spark.launcher.app.Driver                         I      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
org.apache.spark.launcher.app.Driver                         I      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
org.apache.spark.launcher.app.Driver                         I Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
org.apache.spark.launcher.app.Driver                         I      at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
org.apache.spark.launcher.app.Driver                         I      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
org.apache.spark.launcher.app.Driver                         I      at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
org.apache.spark.launcher.app.Driver                         I      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
org.apache.spark.launcher.app.Driver                         I      ... 5 more

I tried sourcing /etc/spark/conf/spark-env.sh and /etc/hadoop/conf/hadoop-env.sh manually, but that did not help. What can I do to resolve this issue? Since I installed Cloudera in Single User Mode, is there anything additional I need to do here?

1 ACCEPTED SOLUTION

Master Guru

Hi,

It is likely that the Spark client configuration is not present on this host.

Try the following:

  1. In Cloudera Manager, go to Spark --> Instances --> Add Role Instances, and add a Gateway role to this host.
  2. In Cloudera Manager, go to the Spark service, click the "Actions" menu, and choose "Deploy Client Configuration".
  3. Test spark-shell again.

Based on the error, I think spark-shell is not able to find an up-to-date copy of the client configuration. If you already have a Gateway role on this host, deploy the client configuration and see if that helps.
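After deploying the client configuration, you can sanity-check that it actually landed on the host before re-running spark-shell. This is only a sketch; the /etc/spark/conf path assumes a standard CDH layout and may differ in your environment:

```shell
#!/bin/sh
# Sketch: verify the Spark client configuration exists on this host.
# The path below assumes a standard CDH layout; adjust if yours differs.
CONF_DIR=/etc/spark/conf

if [ -f "$CONF_DIR/spark-env.sh" ]; then
    echo "spark-env.sh present in $CONF_DIR"
else
    echo "spark-env.sh not found; the Gateway client config may not be deployed"
fi
```

If the second message prints, the Gateway role or the "Deploy Client Configuration" step is the likely culprit.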

Regards,

Ben


3 REPLIES

Expert Contributor

Contributor

Hadoop is installed. A simple "hadoop version" yields this:

Hadoop 2.6.0-cdh5.8.0
Subversion http://github.com/cloudera/hadoop -r 042da8b868a212c843bcbf3594519dd26e816e79
Compiled by jenkins on 2016-07-12T23:02Z
Compiled with protoc 2.5.0
From source with checksum 2b6c31ecc19f118d6e1c822175716b5
This command was run using /opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/jars/hadoop-common-2.6.0-cdh5.8.0.jar

Also, the last line clearly mentions the hadoop-common jar, which is the jar that contains the class from the error in my question.
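Since org.apache.hadoop.fs.FSDataInputStream ships in hadoop-common, one thing worth checking is whether that jar actually appears on the classpath spark-shell sees. A hedged sketch, using a sample classpath string copied from a CDH layout (on a real node you would substitute the output of "hadoop classpath"):

```shell
#!/bin/sh
# Sketch: check whether hadoop-common appears on a classpath string.
# CP is a hard-coded sample here; on a real node use: CP=$(hadoop classpath)
CP="/etc/hadoop/conf:/opt/cloudera/parcels/CDH/jars/hadoop-common-2.6.0-cdh5.8.0.jar"

if echo "$CP" | tr ':' '\n' | grep -q 'hadoop-common'; then
    echo "hadoop-common is on the classpath"   # prints for the sample CP above
else
    echo "hadoop-common is missing; FSDataInputStream will not resolve"
fi
```

If hadoop-common is missing from the real classpath, the Spark client configuration (which wires Hadoop jars into spark-shell) is the place to look.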

Any other suggestions?
