
NoClassDefFoundError when using avro in spark-shell (CDH 5.6)


I keep getting a

java.lang.NoClassDefFoundError: org/apache/avro/mapred/AvroWrapper

when calling show() on a DataFrame object. I'm attempting to do this through the shell (spark-shell --master yarn). The shell recognizes the schema when creating the DataFrame object, but executing any action on the data always throws the NoClassDefFoundError when trying to instantiate the AvroWrapper. I've tried adding avro-mapred-1.8.0.jar to my $HDFS_USER/lib directory on the cluster and also including it with the --jars option when launching the shell. Neither option worked. Any advice would be greatly appreciated. Below is example code:


scala> import org.apache.spark.sql._
scala> val sqc = new SQLContext(sc)
scala> val df ="com.databricks.spark.avro").load("my_avro_file") // recognizes the schema and creates the DataFrame object
scala> // this is where I get the NoClassDefFoundError



Rising Star

Try starting spark-shell with the following packages:


--packages com.databricks:spark-avro_2.10:2.0.1,org.apache.avro:avro-mapred:1.7.7
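Combined with the yarn master from the original post, the full launch command would look something like this (the package coordinates are the ones suggested above; adjust versions to match your cluster):

```shell
# Launch spark-shell on YARN, pulling spark-avro and a matching avro-mapred
# from Maven via Ivy so both driver and executors get the classes.
spark-shell --master yarn \
  --packages com.databricks:spark-avro_2.10:2.0.1,org.apache.avro:avro-mapred:1.7.7
```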



Unfortunately that did not solve the problem. My coworker, who is on a Mac, doesn't run into this problem, and for the life of me I cannot figure out why my Ubuntu box does. I can run in local mode just fine; it's only when I try to run on the cluster that I hit this issue.


Not sure if this will help, but here's the output when I launch spark-shell:


Ivy Default Cache set to: /home/deandaj/.ivy2/cache
The jars for the packages stored in: /home/deandaj/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/lib/spark-assembly-1.6.0-cdh5.7.1-hadoop2.6.0-cdh5.7.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-avro_2.10 added as a dependency
org.apache.avro#avro-mapred added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found com.databricks#spark-avro_2.10;2.0.1 in local-m2-cache
	found org.apache.avro#avro;1.7.6 in central
	found org.codehaus.jackson#jackson-core-asl;1.9.13 in local-m2-cache
	found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in local-m2-cache
	found com.thoughtworks.paranamer#paranamer;2.3 in central
	found org.xerial.snappy#snappy-java;1.0.5 in central
	found org.apache.commons#commons-compress;1.4.1 in central
	found org.tukaani#xz;1.0 in central
	found org.slf4j#slf4j-api;1.6.4 in local-m2-cache
	found org.apache.avro#avro-mapred;1.7.7 in local-m2-cache
	found org.apache.avro#avro-ipc;1.7.7 in local-m2-cache
	found org.apache.avro#avro;1.7.7 in central
	found org.mortbay.jetty#jetty;6.1.26 in local-m2-cache
	found org.mortbay.jetty#jetty-util;6.1.26 in local-m2-cache
	found io.netty#netty;3.4.0.Final in local-m2-cache
	found org.apache.velocity#velocity;1.7 in local-m2-cache
	found commons-collections#commons-collections;3.2.1 in local-m2-cache
	found commons-lang#commons-lang;2.4 in local-m2-cache
	found org.mortbay.jetty#servlet-api;2.5-20081211 in local-m2-cache
:: resolution report :: resolve 6814ms :: artifacts dl 8ms
	:: modules in use:
	com.databricks#spark-avro_2.10;2.0.1 from local-m2-cache in [default]
	com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
	commons-collections#commons-collections;3.2.1 from local-m2-cache in [default]
	commons-lang#commons-lang;2.4 from local-m2-cache in [default]
	io.netty#netty;3.4.0.Final from local-m2-cache in [default]
	org.apache.avro#avro;1.7.7 from central in [default]
	org.apache.avro#avro-ipc;1.7.7 from local-m2-cache in [default]
	org.apache.avro#avro-mapred;1.7.7 from local-m2-cache in [default]
	org.apache.commons#commons-compress;1.4.1 from central in [default]
	org.apache.velocity#velocity;1.7 from local-m2-cache in [default]
	org.codehaus.jackson#jackson-core-asl;1.9.13 from local-m2-cache in [default]
	org.codehaus.jackson#jackson-mapper-asl;1.9.13 from local-m2-cache in [default]
	org.mortbay.jetty#jetty;6.1.26 from local-m2-cache in [default]
	org.mortbay.jetty#jetty-util;6.1.26 from local-m2-cache in [default]
	org.mortbay.jetty#servlet-api;2.5-20081211 from local-m2-cache in [default]
	org.slf4j#slf4j-api;1.6.4 from local-m2-cache in [default]
	org.tukaani#xz;1.0 from central in [default]
	org.xerial.snappy#snappy-java;1.0.5 from central in [default]
	:: evicted modules:
	org.apache.avro#avro;1.7.6 by [org.apache.avro#avro;1.7.7] in [default]
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	|      default     |   19  |   4   |   4   |   1   ||   18  |   0   |

:: problems summary ::
	unknown resolver null

	unknown resolver null

	unknown resolver null

:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	0 artifacts copied, 18 already retrieved (0kB/9ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.
16/06/16 12:50:12 WARN util.Utils: Your hostname, jeff-ubuntu resolves to a loopback address:; using instead (on interface eth0)
16/06/16 12:50:12 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/06/16 12:50:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context available as sc (master = yarn-client, app id = application_1465929284005_0026).


If anyone else runs into this problem: I finally solved it. I removed the CDH Spark package and installed a freshly downloaded Spark distribution instead. After that, everything works fine. I'm not sure what the issue was with the CDH version.
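For anyone who needs to stay on the CDH build, a workaround sometimes used for this kind of driver/executor classpath mismatch is to pass the avro-mapred jar explicitly on both classpaths rather than only via --jars (the jar path below is an assumption; point it at wherever the jar actually lives on the gateway host):

```shell
# Ship the jar to executors AND prepend it to both classpaths, so the
# AvroWrapper class is visible before any conflicting bundled version.
spark-shell --master yarn \
  --jars /path/to/avro-mapred-1.7.7-hadoop2.jar \
  --conf spark.driver.extraClassPath=/path/to/avro-mapred-1.7.7-hadoop2.jar \
  --conf spark.executor.extraClassPath=avro-mapred-1.7.7-hadoop2.jar
```

Note that spark.executor.extraClassPath takes the bare jar name here, since --jars places the file in each executor's working directory.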

Rising Star

This looks strange. Your console output lists the lines below:



com.databricks#spark-avro_2.10 added as a dependency
org.apache.avro#avro-mapred added as a dependency


Can you try once with:



--packages com.databricks:spark-avro_2.10:1.0.0,org.apache.avro:avro-mapred:1.6.3


I suspect a version compatibility issue between avro-mapred and spark-avro.

New Contributor

I am facing the same issue while trying to process an Avro file in PySpark:


"java.lang.NoClassDefFoundError: org/apache/avro/mapred/AvroWrapper"


Spark version 1.5.0-cdh5.5.2