Member since: 10-14-2014
Posts: 5
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 8545 | 10-16-2014 04:44 PM |
10-16-2014 04:44 PM
I found out that there were dependencies not bundled in the spark-examples_2.10-1.0.0-cdh5.1.2.jar file. As was mentioned, the spark-streaming-kafka libraries were missing, as well as the kafka libraries themselves. I zipped these jars into a single zip file, passed it with the --archives option, and it is now working. It doesn't seem like it would be too much work to include the spark-streaming-kafka and kafka libraries with the spark-examples jar, but I have not tried to create an 'uber jar' with all the libraries.
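The submit command ended up looking something like this (the zip name here is just an illustrative placeholder for the archive I built from the kafka and spark-streaming-kafka jars):

spark-submit --archives kafka-streaming-deps.zip --class org.apache.spark.examples.streaming.JavaKafkaWordCount spark-examples_2.10-1.0.0-cdh5.1.2.jar my_kafka_host:my_kafka_port my_consumer_group my_kafka_topic 1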
10-14-2014 05:21 PM
I am running Spark using the CDH5 client packages on Ubuntu 12.04. I was trying to get the JavaKafkaWordCount example in /usr/lib/spark/examples/lib/spark-examples_2.10-1.0.0-cdh5.1.2.jar running on YARN, but I got an error that I can't seem to resolve (as a side note, I was able to get the SparkPi.scala example to work). I noticed that the spark-streaming-kafka library is not available by default in the CDH5 packages, but I did find that it is available here: https://repository.cloudera.com/cloudera/public/org/apache/spark/spark-streaming-kafka_2.10/1.0.0-cdh5.1.2/. So, I downloaded the spark-streaming-kafka JAR file and ran the following command:

spark-submit --jars spark-streaming-kafka_2.10-1.0.0-cdh5.1.2.jar --class org.apache.spark.examples.streaming.JavaKafkaWordCount spark-examples_2.10-1.0.0-cdh5.1.2.jar my_kafka_host:my_kafka_port my_consumer_group my_kafka_topic 1

but I got this error:

INFO cluster.YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils
at org.apache.spark.examples.streaming.JavaKafkaWordCount.main(JavaKafkaWordCount.java:79)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 8 more

I also tried adding the spark-streaming-kafka JAR file to the 'spark.yarn.dist.files' directive in my 'spark-defaults.conf' file. That did not work either. Does anyone know how to resolve this error?
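For reference, the spark-defaults.conf line I tried looked roughly like this (the HDFS path is just a placeholder for wherever I staged the jar):

spark.yarn.dist.files hdfs:///my/path/on/hdfs/spark-streaming-kafka_2.10-1.0.0-cdh5.1.2.jar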
Labels:
- Apache Spark
10-14-2014 05:02 PM
I got the same error trying to run Spark on YARN. I fixed it by copying /usr/lib/hadoop/client/hadoop-mapreduce-client-core.jar into HDFS and then pointing the 'spark.yarn.dist.files' directive in my /etc/spark/conf/spark-defaults.conf file at that HDFS copy:

spark.yarn.dist.files /my/path/on/hdfs/hadoop-mapreduce-client-core.jar
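The copy into HDFS was just a standard put (the target directory here is the same placeholder path as above):

hadoop fs -put /usr/lib/hadoop/client/hadoop-mapreduce-client-core.jar /my/path/on/hdfs/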