
NoClassDefFoundError due to Incompatible Spark Version

New Contributor

Hi all,

I developed my Spark application with Spark 3.4, and it runs fine on its original platform. Now I want to run it on another platform that uses Spark 2.4. When I submit the job with the command below,

spark-submit --class Main \
  --master yarn \
  --deploy-mode client \
  --num-executors 6 \
  --executor-memory 10G \
  --driver-memory 20G \
  --executor-cores 4 \
  --driver-cores 4 \
  --conf spark.yarn.submit.waitAppCompletion=true \
  --conf spark.executor.extraJavaOptions="-XX:+UseG1GC -XX:MaxGCPauseMillis=20" \
  --conf spark.driver.extraJavaOptions="-XX:+UseG1GC -XX:MaxGCPauseMillis=20" \
  --jars /home/cdsw/mysql-connector-java-5.1.49.jar \
  /home/cdsw/MyApplication.jar

an error appears as shown:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/connector/catalog/TableProvider
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
        at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
        at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
        at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:43)
        at scala.collection.Iterator$class.foreach(Iterator.scala:891)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$class.filterImpl(TraversableLike.scala:247)
        at scala.collection.TraversableLike$class.filter(TraversableLike.scala:259)
        at scala.collection.AbstractTraversable.filter(Traversable.scala:104)
        at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:648)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:214)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:187)
        at Main$.main(Main.scala:33)
        at Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:847)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:922)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:931)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.connector.catalog.TableProvider
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        ... 52 more

I've downgraded my Spark SQL library to match the version on the target platform, but the problem still persists. FYI, I'm using Scala for my application.

libraryDependencies += "org.json4s" %% "json4s-native" % "4.0.0"
libraryDependencies += "org.json4s" %% "json4s-jackson" % "4.0.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0"

Please advise

Thanks

1 ACCEPTED SOLUTION

Master Collaborator

Hi @Ismail_A

Don't bundle the Spark libraries and their dependent jars into your application's fat jar (MyApplication.jar); the cluster supplies them at runtime.
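The root cause in the trace is that the fat jar was built against Spark 3.4's org.apache.spark.sql.connector.catalog.TableProvider, which does not exist in Spark 2.4, so class loading fails at runtime. A minimal standalone sketch of the same failure mode (MissingClassDemo is a hypothetical name for illustration):

```scala
// Checks whether a class can be loaded from the current runtime classpath.
// Class.forName throws ClassNotFoundException when the class is absent --
// the same root cause chained under the NoClassDefFoundError above.
object MissingClassDemo {
  def isPresent(name: String): Boolean =
    try { Class.forName(name); true }
    catch { case _: ClassNotFoundException => false }

  def main(args: Array[String]): Unit = {
    // Present only on a Spark 3.x classpath; absent under Spark 2.4.
    println(isPresent("org.apache.spark.sql.connector.catalog.TableProvider"))
    println(isPresent("java.lang.String")) // always on the JVM classpath
  }
}
```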

If you are using the Maven build tool, set the dependency scope to provided:

<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.12</artifactId>
    <version>3.4.0</version>
    <scope>provided</scope>
</dependency>

If you are using Gradle, you can do something similar:

dependencies {
    compileOnly group: 'org.apache.spark', name: 'spark-sql_2.12', version: '3.4.0'
}
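Since the build file in the question uses sbt, the equivalent fix there (a sketch; it assumes the cluster runs Spark 2.4 built for Scala 2.11, as stock Spark 2.4 is) would be:

```scala
// build.sbt -- compile against the cluster's Spark version, but mark it
// "provided" so sbt-assembly does not bundle it into the fat jar.
scalaVersion := "2.11.12" // Spark 2.4 is published for Scala 2.11 (and 2.12)

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql"  % "2.4.0" % "provided"
```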

If the above steps do not resolve your issue, run your code by launching spark-shell/pyspark and check whether it works there.


3 REPLIES

Community Manager

@Ismail_A, welcome to our community! To help you get the best possible answer, I have tagged our Spark expert @RangaReddy, who may be able to assist you further.

Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.



Regards,

Vidya Sargur,
Community Manager




New Contributor

Thanks, @RangaReddy . It solved my problem. 👏