Support Questions

How to run Kafka in Spark


I am trying this command:

spark-submit --master local[4] --class com.oreilly.learningsparkexamples.scala.KafkaInput $ASSEMBLY_JAR localhost:2181 spark-readers pandas 1

but I get the following error:

Exception in thread "Thread-73" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.IllegalArgumentException: Missing required property 'groupid'
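
For context, the KafkaInput example presumably takes four arguments matching the ones passed above (ZooKeeper quorum, consumer group, topic, thread count) and feeds them into the receiver-based Kafka API. Below is a minimal sketch of that pattern, assuming Spark 1.x with the spark-streaming-kafka artifact on the classpath; the names are illustrative, not the exact Learning Spark source.

// Sketch of the receiver-based Kafka pattern the command above targets.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaInputSketch {
  def main(args: Array[String]): Unit = {
    // Expected arguments: <zkQuorum> <group> <topic> <numThreads>
    // e.g. localhost:2181 spark-readers pandas 1
    val Array(zkQuorum, group, topic, numThreads) = args

    val conf = new SparkConf().setAppName("KafkaInputSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // The consumer group id is required by the receiver; if it is empty or
    // missing, the underlying consumer will refuse to start.
    val topics = Map(topic -> numThreads.toInt)
    val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topics).map(_._2)
    lines.print()

    ssc.start()
    ssc.awaitTermination()
  }
}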

1 ACCEPTED SOLUTION

Guru

This sounds like it may be a build problem.

https://github.com/simonellistonball/spark-samples... has a working sample with sbt scripts to build against the Hortonworks repository, which has been tested on HDP 2.3.2.

Note that the Kafka consumer API has changed a bit recently, so it's important to be aware of which Kafka version you are building against.
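
For instance, in an sbt build the Kafka integration artifact is pinned alongside the Spark version; the 1.4.1 below is only illustrative, so match whatever your HDP build actually provides:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.4.1"
)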

Also, I note that you're running in local mode. We would recommend using local mode only for testing, and --master yarn-client for running on a proper cluster.
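
For example, the same submission against YARN would look roughly like this (reusing the assembly jar and arguments from the question):

spark-submit --master yarn-client --class com.oreilly.learningsparkexamples.scala.KafkaInput $ASSEMBLY_JAR localhost:2181 spark-readers pandas 1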


5 REPLIES


Expert Contributor

@Simon Elliston Ball do you have sample code for Java?

Master Mentor

Expert Contributor

@Neeraj Sabharwal thanks, that is a great tutorial, but that example is in Scala as well.

Master Mentor