We recently started migrating our architecture to CDH 5.14.2 (with Spark 1.6, Kafka 1.0.1 and Scala 2.10). We are having problems migrating our Streaming applications, in particular when we try to use the following corresponding libraries (artifacts):
As you can see, we did not find a corresponding artifact for the kafka_2.10 library. Do you have any feedback on that? Is it perhaps necessary to also upgrade the Scala version (to 2.11), the Spark version, or both?
Has anyone already run into these migration problems with Spark 1.6 and Kafka 1.0.1, and could you share a working configuration for CDH 5.14.2?
Spark Streaming in CDH 5.14 uses Kafka clients based on Apache Kafka 0.10.2, which corresponds to Cloudera Kafka 2.2.0. You can see how the Spark gateway sets up the classpath by looking at /etc/spark/conf/classpath.txt:
[root@host-514 ~]# hadoop version
Subversion http://github.com/cloudera/hadoop -r 5724a4ad7a27f7af31aa725694d3df09a68bb213
Compiled by jenkins on 2018-03-27T20:40Z
Compiled with protoc 2.5.0
From source with checksum 302899e86485742c090f626a828b28
This command was run using /opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/hadoop-common-2.6.0-cdh5.14.2.jar
[root@host-514 ~]# cat /etc/spark/conf/classpath.txt |grep kafka |grep -v flume
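If you want to check this on several gateway hosts, the same filter can be scripted. The following is only a minimal sketch, not an official Cloudera tool: the function name kafka_jars and the jar-name pattern are my own assumptions, and it simply assumes each line of classpath.txt is a path to a jar named <artifact>-<version>.jar.

```python
import re
from pathlib import Path

# Assumed jar naming scheme: <artifact>-<version>.jar, where the version
# starts with a digit. This is a heuristic, not a documented format.
JAR_RE = re.compile(r"^(?P<name>[\w.-]+?)-(?P<version>\d[\w.-]*)\.jar$")


def kafka_jars(lines):
    """Return (artifact, version) pairs for Kafka jars, skipping Flume ones."""
    found = []
    for line in lines:
        path = line.strip()
        # Same filter as `grep kafka | grep -v flume` on the whole line.
        if "kafka" not in path or "flume" in path:
            continue
        match = JAR_RE.match(path.rsplit("/", 1)[-1])
        if match:
            found.append((match.group("name"), match.group("version")))
    return found


if __name__ == "__main__":
    # Path taken from the command above; adjust it if your gateway differs.
    classpath = Path("/etc/spark/conf/classpath.txt")
    if classpath.exists():
        for name, version in kafka_jars(classpath.read_text().splitlines()):
            print(name, version)
```

Run it on the gateway host and compare the printed Kafka client version with the one your application is built against. Lines that match the filter but do not follow the assumed naming scheme are silently skipped.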