Reply
New Contributor
Posts: 3
Registered: ‎01-31-2017

Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Hi,

 

Need to use kafka 0.10 for various reasons with CDH 5.9

 

Now CDH 5.9 comes bundled with kafka 0.9 and has the kafka 0.9 jar in SPARK_HOME.

Having the kafka 0.10 jar in the spark fat (uber) jar does not automatically override the one in SPARK_HOME resulting in the following exception:

 

17/01/31 16:32:36 INFO utils.AppInfoParser: Kafka version : 0.9.0-kafka-2.0.0

17/01/31 16:32:36 INFO utils.AppInfoParser: Kafka commitId : unknown

Exception in thread "streaming-start" java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V

 

One way to do this to move the kafka 0.9 jars out of SPARK_HOME. Is there any other way to do this?

I tried using spark.executor.userClassPathFirst but it gives me a different error.

 

 

 

 

Cloudera Employee
Posts: 93
Registered: ‎05-10-2016

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Spark 2 adds support for Kafka 0.10, would it be possible to use Spark 2?  http://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

 

Kafka client has a significant change in .10, so replacing client jars will be difficult.  Using Spark 2 will be the easiest path.

New Contributor
Posts: 3
Registered: ‎01-31-2017

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

[ Edited ]

We are using Spark 2 on CDH 5.10. It was installed using the instructions at http://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

But this version pulls in the 0.9 version of kafka jars. And we need kafka 0.10

 

 

Cloudera Employee
Posts: 93
Registered: ‎05-10-2016

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Yes, you're right, Cloudera does package the 0.9 version of Apache Kafka client libraries with Spark 2.  Ky afka 0.10 brokers are supposed to be able to support the 0.9 client, is there a specific error you are receiving or are you just looking for some of the new features added to the 0.10 client?

 

Unfortontely if you need to 0.10 client version, you will need to replace the 0.9 client jar with the 0.10 client jar as you described or have a seperate spark binary not managed by Cloudera Manager making sure to includ YARN and Hive configurations when launching the unmanaged binaries.

Cloudera Employee
Posts: 2
Registered: ‎09-29-2015

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Based on the documentation "When running jobs that require the new Kafka integration, set SPARK_KAFKA_VERSION=0.10 in the shell before launching spark-submit. Use the appropriate environment variable syntax for your shell"...

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_kafka.html#running_jobs

Announcements