
spark-shell command failing

Rising Star

Hello Team,

 

We have CDH 5.15.3 installed with Spark2 version 2.3.

 

Kindly confirm below:

 

1. Does CDH 5.15.3 support Spark2 2.3?

 

2. Running spark-shell on the server where the Gateway role is installed throws the below error:

 

[cloudera-scm@a302-0144-2944 cloudera-scm-agent]$ spark-shell
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
        at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:123)
        at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:123)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.deploy.SparkSubmitArguments.mergeDefaultSparkProperties(SparkSubmitArguments.scala:123)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:109)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

 

3. Is any configuration needed to use Spark on YARN?

 

Kindly help to fix this issue.

 

- VIjay M

6 REPLIES

1. Yes, it does.
2. You need the YARN, HDFS, and Spark2 gateway roles deployed. Something is missing on the host.
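
A quick way to check whether those client configurations are actually present on the host (a minimal sketch, assuming the usual CDH parcel layout with client configs under /etc/hadoop/conf and /etc/spark2/conf) is:

# List the client configs that the HDFS/YARN and Spark2 gateway roles deploy
ls /etc/hadoop/conf /etc/spark2/conf

# With the HDFS/YARN gateway in place, this should print a non-empty jar classpath
hadoop classpath

If either directory is empty or missing, add the corresponding gateway role to the host in Cloudera Manager and deploy the client configuration again.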

Rising Star

@Tomas79

 

If I run the below command before running spark-shell, it works.

export SPARK_DIST_CLASSPATH=`hadoop classpath`

 

But I do not want to run the above command every time. Kindly suggest a permanent fix? (See the sketch after this post.)

 

- VIjay M
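
To avoid exporting SPARK_DIST_CLASSPATH by hand in every session, one workaround (a sketch only, assuming a per-user fix in the login shell profile is acceptable; the cleaner solution is still to deploy the missing gateway roles so the generated client config sets the classpath for you) is:

# Append the export to the user's shell profile so each new shell picks it up;
# single quotes keep $(hadoop classpath) unexpanded until the profile is sourced
echo 'export SPARK_DIST_CLASSPATH=$(hadoop classpath)' >> ~/.bashrc
source ~/.bashrc

Editing the Cloudera-managed spark-env.sh directly is not recommended, since Cloudera Manager can overwrite it the next time client configurations are deployed; the spark-env.sh advanced configuration snippet (safety valve) in Cloudera Manager is the supported place for such additions.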

 

New Contributor

Were you able to find the reason? I am also having the same issue with CDH 5.11. Spark and PySpark are not working, and they only work after I set the classpath as you mentioned. Hive and MapReduce work fine. I have Spark 1.6.0 and Scala 2.10.5.

There is a difference between Spark and Spark2. If you want to use Spark2, then you should run spark2-shell. And if you want Spark 1.x, then you should deploy the Spark (1.x) gateway role on the host.
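
To see which client each command resolves to on a given host (a small sketch, assuming both the CDH built-in Spark 1.6 and the Spark2 parcel are installed):

# Show which binaries are on the PATH
which spark-shell spark2-shell

# Each should print its own version banner (1.6.x vs 2.3.x here); if one fails
# with the same NoClassDefFoundError, that client's gateway config is the one
# missing on this host
spark-shell --version
spark2-shell --version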

Rising Star
Let me try that and I will update here.

New Contributor

Yes, spark2-shell and spark2-submit have these issues as well. Any insights?