Support Questions


How to run Kafka in Spark

New Contributor

I am trying this command:

spark-submit --master local[4] --class com.oreilly.learningsparkexamples.scala.KafkaInput $ASSEMBLY_JAR localhost:2181 spark-readers pandas 1

but I get the following error:

Exception in thread "Thread-73" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: Missing required property 'groupid'

1 ACCEPTED SOLUTION

Guru

This sounds like it may be a build problem.

https://github.com/simonellistonball/spark-samples...

has a working sample with sbt scripts to build against the Hortonworks repository, which has been tested on HDP 2.3.2.

Note that the Kafka consumer API has changed a bit recently, so it's important to be aware of which Kafka version you are building against.

Also, I note that you're running in local mode. We would recommend using local mode only for testing, and using --master yarn-client when running on a proper cluster.
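Since the exception complains about a missing group id, it's worth checking that the consumer group argument actually makes it into the Kafka properties. A minimal sketch in Scala (the helper name and parameter wiring here are my own illustration, not the book's code; the values mirror the positional arguments in your spark-submit command):

```scala
// Build the Kafka consumer properties for a Spark Streaming receiver.
// zkQuorum and groupId correspond to the first two positional arguments
// in the command above: localhost:2181 and spark-readers.
def kafkaParams(zkQuorum: String, groupId: String): Map[String, String] =
  Map(
    "zookeeper.connect" -> zkQuorum,
    "group.id"          -> groupId,       // omitting this is what raises the IllegalArgumentException
    "auto.offset.reset" -> "smallest"     // read from the earliest offset while testing
  )

// With spark-streaming-kafka on the classpath, the receiver would then be
// created roughly like this (sketch only, not compiled here):
//
//   val stream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
//     ssc,
//     kafkaParams("localhost:2181", "spark-readers"),
//     Map("pandas" -> 1),                 // topic -> number of receiver threads
//     StorageLevel.MEMORY_AND_DISK_SER)
```

If the map you pass to the stream constructor has "group.id" set, that particular error should go away, whatever else the build issue turns out to be.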


5 REPLIES


Rising Star

@Simon Elliston Ball do you have sample code for Java?

Rising Star

@Neeraj Sabharwal thanks, that is a great tutorial, but that example is in Scala as well.
