CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Explorer

Dear all,

I have a CDH 5.7 test system running, and when I try to start spark-shell or pyspark, the following error appears:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:117)
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:117)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.deploy.SparkSubmitArguments.mergeDefaultSparkProperties(SparkSubmitArguments.scala:117)
	at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:103)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 7 more

I have checked the environment and cannot find any problems:

env | grep SPARK
SPARK_HOME=/usr/lib/spark
SPARK_CONF_DIR=/etc/spark/conf
SPARK_LIBRARY_PATH=/usr/lib/spark/lib
SPARK_LAUNCH_WITH_SCALA=0
PYSPARK_ARCHIVES_PATH=local:/usr/lib/spark/python/lib/py4j-0.9-src.zip,local:/usr/lib/spark/python/lib/pyspark.zip
SPARK_DIST_CLASSPATH=/usr/lib/hadoop
SPARK_JAR_HDFS_PATH=

Any ideas?

Thanks in advance.

Re: CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Explorer
As an additional comment: running source /etc/spark/conf/spark-env.sh did not change anything either.

Re: CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Hi, I am getting the same issue. I tried all of these path settings as well, but I still get the same exception. Let me know if you find a solution. Thanks,

Ravi Papisetti

Re: CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Yes, I'm getting the same error while running an R job through an Oozie Shell Action. Did you find any solution or workaround?

Here's what I see in my stdout:

Stdoutput > myHadoopCluster = rxSparkConnect(reset = TRUE, namenode = "default", port = 0, hdfsShareDir = paste( "/user/RevoShare", Sys.info()[["user"]],     sep="/" ), shareDir = "/tmp")
Stdoutput Error: error while running getHadoopEnvVars.py
Stdoutput Warning: /home//RevoHadoopEnvVars.site file not found, skipping
Stdoutput Warning: running as user1_1 with home /home/
Stdoutput ERROR: Fail to execute spark-submit. Last 10 lines' log:
Stdoutput at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Stdoutput Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
Stdoutput at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
Stdoutput at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
Stdoutput at java.security.AccessController.doPrivileged(Native Method)
Stdoutput at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
Stdoutput at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
Stdoutput at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
Stdoutput at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
Stdoutput ... 5 more
Exit code of the Shell command 1

 

Re: CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Super Collaborator

Hi guys,

Did you manage to solve it, and if so, how?

Re: CDH 5.7 spark-shell java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream

Cloudera Employee

Hi,

Did you try adding the Spark jars to the classpath as well, and did you check whether that changes anything?

Thanks

AK
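
One thing that may be worth checking, given the environment listing in the original post: SPARK_DIST_CLASSPATH is set to /usr/lib/hadoop, which is a bare directory. The JVM only picks up .class files and resources from a directory entry; jars inside it are loaded only when the entry ends in /* or names the jar itself. If that is what is happening here, hadoop-common (the jar that contains org.apache.hadoop.fs.FSDataInputStream) would never reach Spark's classpath. The following is a minimal shell sketch for spotting such entries, not a confirmed fix for this thread. The helper name bare_dir_entries is hypothetical, and the commented-out export assumes the hadoop command is on PATH.

```shell
#!/bin/sh
# Hypothetical helper: print every classpath entry that is neither a jar
# nor a /* wildcard. Such entries contribute no jars to the JVM classpath.
# Note: some of them are legitimate (for example a config directory), so
# the output is a list of things to inspect, not a list of errors.
bare_dir_entries() {
    # $1 is a colon-separated classpath string
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
        case "$entry" in
            ''|*.jar|*/\*) ;;                 # empty, jar, or wildcard entry
            *) printf '%s\n' "$entry" ;;      # plain directory or other file
        esac
    done
}

# Inspect the current setting; falls back to the value shown in the post.
bare_dir_entries "${SPARK_DIST_CLASSPATH:-/usr/lib/hadoop}"

# Possible remedy (assumption: `hadoop` is installed and on PATH):
# export SPARK_DIST_CLASSPATH=$(hadoop classpath)
```

If /usr/lib/hadoop shows up in the output, setting SPARK_DIST_CLASSPATH from the output of hadoop classpath and then restarting spark-shell would be a reasonable next experiment.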
