getting error while submitting spark job

I am getting the following error while submitting a Spark job from the command line:

Spark Streaming's Kafka libraries not found in class path. Try one of the following.

1. Include the Kafka library and its dependencies with in the spark-submit command as

   $ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka:1.5.2 ...

2. Download the JAR of the artifact from Maven Central http://search.maven.org/, Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-assembly, Version = 1.5.2. Then, include the jar in the spark-submit command as

   $ bin/spark-submit --jars <spark-streaming-kafka-assembly.jar> ...

The Python code I am running is:

from pyspark.sql import SQLContext
from pyspark import SparkContext, SparkConf
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
import json

sc = SparkContext(appName="Clickstream_kafka")
stream = StreamingContext(sc, 2)
kafka_stream = KafkaUtils.createStream(stream,"172.16.10.13:2181","raw-event-streaming-consumer",{"event":1})
parsed = kafka_stream.map(lambda (k, v): json.loads(v))
print(parsed.collect())
stream.start()
stream.awaitTermination()
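
A side note on the code above, separate from the class path error: a DStream has no collect() method, so print(parsed.collect()) would likely fail once the Kafka libraries load, and pprint() is the usual way to print a few records per batch. Below is a minimal sketch of the parsing step written as a named function. The name parse_record is hypothetical (it is not in the original post), and the sketch assumes each record arrives as a (key, value) pair whose value is a JSON string. It also avoids the lambda (k, v) tuple-unpacking syntax, which only works on Python 2.

```python
import json


def parse_record(record):
    """Decode the JSON payload of one (key, value) Kafka record."""
    _key, value = record  # the key is ignored; only the value carries the event
    return json.loads(value)


# Assumed wiring inside the streaming job (not executed here):
#   parsed = kafka_stream.map(parse_record)
#   parsed.pprint()  # prints the first few records of every batch

# Local check with a hand-written sample record.
sample = (None, '{"event": "click", "user": 42}')
print(parse_record(sample))
```

Because parse_record is a plain function, it can be tried on sample tuples like the one above before submitting the job.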

Accepted Solutions

Re: getting error while submitting spark job

With the spark-submit option --jars, are you passing the spark-streaming-kafka-assembly jar along with the kafka_2.10-*.jar from the /usr/hdp/2.4.0.0-169/kafka/libs/ location?

Re: getting error while submitting spark job

I am only running it like this:

spark-submit <file_name.py>

Re: getting error while submitting spark job

The Spark job runs fine now. I used:

spark-submit --jars spark-assembly-1.5.2.2.3.4.7-4-hadoop2.7.1.2.3.4.7-4.jar,spark-streaming-kafka-assembly_2.10-1.6.1.jar <file.py>
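
For anyone scripting the submit step: --jars takes a single comma-separated value with no spaces between the jar paths. The sketch below assembles the same command as an argument list. The helper build_spark_submit is a hypothetical name, and "file.py" stands in for the <file.py> placeholder above.

```python
def build_spark_submit(script, jars):
    """Return the argv list for spark-submit, joining jars with commas."""
    cmd = ["spark-submit"]
    if jars:
        # --jars expects one comma-separated argument, without spaces
        cmd += ["--jars", ",".join(jars)]
    cmd.append(script)
    return cmd


cmd = build_spark_submit(
    "file.py",  # placeholder script name
    [
        "spark-assembly-1.5.2.2.3.4.7-4-hadoop2.7.1.2.3.4.7-4.jar",
        "spark-streaming-kafka-assembly_2.10-1.6.1.jar",
    ],
)
print(" ".join(cmd))
```

The resulting list can be handed to subprocess.call(cmd) on a machine where spark-submit is on the PATH.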