
KafkaConsumer.subscribe 0.9 vs 0.10 in Structured streaming


Master Collaborator

Hi, I am trying to run a Structured Streaming job on CDH 5.11.1 with Kafka 2.2.0-1.2.2.0.p0.68.

The stream fails because no subscribe method with this signature can be found:

 

org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)

 

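I found out that this is due to an API change between the 0.9 and 0.10 Kafka clients: in 0.9, subscribe took a java.util.List, while in 0.10 it takes a java.util.Collection.

My project's sbt build points to the 0.10 artifacts: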

libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-10_2.11" % "2.2.0"
libraryDependencies += "org.apache.spark" % "spark-sql-kafka-0-10_2.11" % "2.2.0"
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.10.2.1"

 

CDH should also be using Kafka 0.10. The spark-submit command passes the packages explicitly:

 

spark-submit ... --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0,org.apache.kafka:kafka-clients:0.10.2.1

 

So I don't understand where the 0.9 client comes from, or why the older API is the one found at runtime.

 

 

17/09/20 14:48:07 ERROR streaming.StreamExecution: Query [id = 60109f59-c6fc-4956-b559-16b41570e45a, runId = 8c86c4a5-f510-4269-a33e-fb19390cb054] terminated with error
java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V
at org.apache.spark.sql.kafka010.SubscribeStrategy.createConsumer(ConsumerStrategy.scala:63)
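
One way to check which client API a given kafka-clients jar exposes is to list the KafkaConsumer method signatures with javap and look for a subscribe overload that takes a java.util.Collection. This is only a minimal sketch, assuming a JDK is on the PATH; the function name has_collection_subscribe is hypothetical and the jar path is a placeholder, not a real CDH location.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `javap` output on stdin and succeeds only if
# a subscribe overload taking java.util.Collection (the 0.10-style API) is present.
has_collection_subscribe() {
  grep -qF 'subscribe(java.util.Collection'
}

# Example usage (the jar path is a placeholder):
# javap -cp /path/to/kafka-clients.jar org.apache.kafka.clients.consumer.KafkaConsumer \
#   | has_collection_subscribe && echo "0.10-style API" || echo "0.9-style API"
```

Running this against each kafka-clients jar that could end up on the driver and executor classpath may help show where the older client comes from.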

 


Re: KafkaConsumer.subscribe 0.9 vs 0.10 in Structured streaming

Contributor

Hi, I had the same problem and reused the solution from a different issue, where I was setting up kerberized Kafka with Sentry. To run a Spark program against kerberized Kafka, you have to switch Spark to use Kafka 0.10. You can do this in two ways:

1. Export the Kafka version before running spark2-submit:
$ export SPARK_KAFKA_VERSION=0.10

$ spark2-submit ...

 

2. Set the default Kafka version in the Spark 2 service configuration in Cloudera Manager.

 

I also ran a Structured Streaming ETL job and hit the same problem as you with version 0.9. After setting the export, my program started properly.
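
If you would rather not change the environment of the whole shell session, the variable can also be set for a single command. This is only a sketch: the wrapper name with_kafka_010 is hypothetical, and it relies on nothing more than the standard env utility and the SPARK_KAFKA_VERSION variable mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs the given command with SPARK_KAFKA_VERSION=0.10
# set only for that command, leaving the calling shell's environment untouched.
with_kafka_010() {
  env SPARK_KAFKA_VERSION=0.10 "$@"
}

# Example usage (arguments elided as in the original post):
# with_kafka_010 spark2-submit ...
```

This should behave the same as the export in option 1, just scoped to one invocation.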

 

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_kafka.html
