06-21-2016 10:34 AM
If anyone else runs into this problem, I finally solved it. I removed the CDH Spark package and downloaded Spark from http://spark.apache.org/downloads.html instead. After that, everything works fine. I'm not sure what the issue was with the CDH version.
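For reference, the workaround amounts to something like the following environment setup. The exact version and install path here are assumptions; pick the build matching your cluster's Hadoop version (1.6.0 / Hadoop 2.6 is assumed to match the CDH 5.7.1 stack in the log below):

```shell
# Download a stock Apache Spark build matching the cluster's Hadoop version
# (spark-1.6.0-bin-hadoop2.6 is an assumption based on this thread)
wget https://archive.apache.org/dist/spark/spark-1.6.0/spark-1.6.0-bin-hadoop2.6.tgz
tar -xzf spark-1.6.0-bin-hadoop2.6.tgz

# Point SPARK_HOME at the new install and use its spark-shell instead of CDH's
export SPARK_HOME="$PWD/spark-1.6.0-bin-hadoop2.6"
export PATH="$SPARK_HOME/bin:$PATH"
```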
06-16-2016 12:51 PM
Not sure if this will help, but here's the output when I launch spark-shell:

Ivy Default Cache set to: /home/deandaj/.ivy2/cache
The jars for the packages stored in: /home/deandaj/.ivy2/jars
:: loading settings :: url = jar:file:/usr/lib/spark/lib/spark-assembly-1.6.0-cdh5.7.1-hadoop2.6.0-cdh5.7.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-avro_2.10 added as a dependency
org.apache.avro#avro-mapred added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found com.databricks#spark-avro_2.10;2.0.1 in local-m2-cache
found org.apache.avro#avro;1.7.6 in central
found org.codehaus.jackson#jackson-core-asl;1.9.13 in local-m2-cache
found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in local-m2-cache
found com.thoughtworks.paranamer#paranamer;2.3 in central
found org.xerial.snappy#snappy-java;1.0.5 in central
found org.apache.commons#commons-compress;1.4.1 in central
found org.tukaani#xz;1.0 in central
found org.slf4j#slf4j-api;1.6.4 in local-m2-cache
found org.apache.avro#avro-mapred;1.7.7 in local-m2-cache
found org.apache.avro#avro-ipc;1.7.7 in local-m2-cache
found org.apache.avro#avro;1.7.7 in central
found org.mortbay.jetty#jetty;6.1.26 in local-m2-cache
found org.mortbay.jetty#jetty-util;6.1.26 in local-m2-cache
found io.netty#netty;3.4.0.Final in local-m2-cache
found org.apache.velocity#velocity;1.7 in local-m2-cache
found commons-collections#commons-collections;3.2.1 in local-m2-cache
found commons-lang#commons-lang;2.4 in local-m2-cache
found org.mortbay.jetty#servlet-api;2.5-20081211 in local-m2-cache
:: resolution report :: resolve 6814ms :: artifacts dl 8ms
:: modules in use:
com.databricks#spark-avro_2.10;2.0.1 from local-m2-cache in [default]
com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
commons-collections#commons-collections;3.2.1 from local-m2-cache in [default]
commons-lang#commons-lang;2.4 from local-m2-cache in [default]
io.netty#netty;3.4.0.Final from local-m2-cache in [default]
org.apache.avro#avro;1.7.7 from central in [default]
org.apache.avro#avro-ipc;1.7.7 from local-m2-cache in [default]
org.apache.avro#avro-mapred;1.7.7 from local-m2-cache in [default]
org.apache.commons#commons-compress;1.4.1 from central in [default]
org.apache.velocity#velocity;1.7 from local-m2-cache in [default]
org.codehaus.jackson#jackson-core-asl;1.9.13 from local-m2-cache in [default]
org.codehaus.jackson#jackson-mapper-asl;1.9.13 from local-m2-cache in [default]
org.mortbay.jetty#jetty;6.1.26 from local-m2-cache in [default]
org.mortbay.jetty#jetty-util;6.1.26 from local-m2-cache in [default]
org.mortbay.jetty#servlet-api;2.5-20081211 from local-m2-cache in [default]
org.slf4j#slf4j-api;1.6.4 from local-m2-cache in [default]
org.tukaani#xz;1.0 from central in [default]
org.xerial.snappy#snappy-java;1.0.5 from central in [default]
:: evicted modules:
org.apache.avro#avro;1.7.6 by [org.apache.avro#avro;1.7.7] in [default]
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 19 | 4 | 4 | 1 || 18 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: ERRORS
unknown resolver null
unknown resolver null
unknown resolver null
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
:: retrieving :: org.apache.spark#spark-submit-parent
confs: [default]
0 artifacts copied, 18 already retrieved (0kB/9ms)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.6.0
/_/
Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.
16/06/16 12:50:12 WARN util.Utils: Your hostname, jeff-ubuntu resolves to a loopback address: 127.0.1.1; using 10.104.1.90 instead (on interface eth0)
16/06/16 12:50:12 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/06/16 12:50:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context available as sc (master = yarn-client, app id = application_1465929284005_0026).
06-16-2016 12:48 PM
Unfortunately, that did not solve the problem. My coworker, who is on a Mac, doesn't run into this issue, and for the life of me I cannot figure out why my Ubuntu box does. I can run in local mode just fine; it's only when I try to run on the cluster that I hit this error.
06-14-2016 11:07 AM
I keep getting a java.lang.NoClassDefFoundError: org/apache/avro/mapred/AvroWrapper when calling show() on a DataFrame. I'm running through the shell (spark-shell --master yarn). The shell recognizes the schema when creating the DataFrame, but any action on the data throws the NoClassDefFoundError when it tries to instantiate the AvroWrapper. I've tried adding avro-mapred-1.8.0.jar to my $HDFS_USER/lib directory on the cluster and also including it with the --jars option when launching the shell. Neither worked. Any advice would be greatly appreciated. Below is example code:

scala> import org.apache.spark.sql._
scala> import com.databricks.spark.avro._
scala> val sqc = new SQLContext(sc)
scala> val df = sqc.read.avro("my_avro_file") // recognizes the schema and creates the DataFrame object
scala> df.show // this is where I get NoClassDefFoundError
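For anyone reproducing this: one way to make sure avro-mapred reaches the YARN executors (and not just the driver) is to pull it in via --packages instead of --jars. The coordinates and versions below are assumptions taken from the dependency-resolution report posted later in this thread:

```shell
# Resolve spark-avro and a matching avro-mapred from Maven so they are
# shipped to the YARN executors as well as the driver
# (versions assumed from the Ivy resolution report in this thread)
spark-shell --master yarn \
  --packages com.databricks:spark-avro_2.10:2.0.1,org.apache.avro:avro-mapred:1.7.7
```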