Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

New Contributor

Hi,

 

Need to use kafka 0.10 for various reasons with CDH 5.9

 

Now CDH 5.9 comes bundled with kafka 0.9 and has the kafka 0.9 jar in SPARK_HOME.

Having the kafka 0.10 jar in the spark fat (uber) jar does not automatically override the one in SPARK_HOME resulting in the following exception:

 

17/01/31 16:32:36 INFO utils.AppInfoParser: Kafka version : 0.9.0-kafka-2.0.0

17/01/31 16:32:36 INFO utils.AppInfoParser: Kafka commitId : unknown

Exception in thread "streaming-start" java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V

 

One way to do this to move the kafka 0.9 jars out of SPARK_HOME. Is there any other way to do this?

I tried using spark.executor.userClassPathFirst but it gives me a different error.

 

 

 

 

4 REPLIES 4

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Expert Contributor

Spark 2 adds support for Kafka 0.10, would it be possible to use Spark 2?  http://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

 

Kafka client has a significant change in .10, so replacing client jars will be difficult.  Using Spark 2 will be the easiest path.

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

New Contributor

We are using Spark 2 on CDH 5.10. It was installed using the instructions at http://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html

But this version pulls in the 0.9 version of kafka jars. And we need kafka 0.10

 

 

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Expert Contributor

Yes, you're right, Cloudera does package the 0.9 version of Apache Kafka client libraries with Spark 2.  Ky afka 0.10 brokers are supposed to be able to support the 0.9 client, is there a specific error you are receiving or are you just looking for some of the new features added to the 0.10 client?

 

Unfortontely if you need to 0.10 client version, you will need to replace the 0.9 client jar with the 0.10 client jar as you described or have a seperate spark binary not managed by Cloudera Manager making sure to includ YARN and Hive configurations when launching the unmanaged binaries.

Highlighted

Re: Using kafka 0.10 with CDH 5.9 Spark 2.0 cloudera beta

Cloudera Employee

Based on the documentation "When running jobs that require the new Kafka integration, set SPARK_KAFKA_VERSION=0.10 in the shell before launching spark-submit. Use the appropriate environment variable syntax for your shell"...

https://www.cloudera.com/documentation/spark2/latest/topics/spark2_kafka.html#running_jobs