08-04-2016 02:10 AM
We are in the process of investigating the key differentiators between the Kafka distributions in Cloudera & Confluent and wanted to get a second opinion on this.
Please could you outline what you believe to be the key distinguishing features with respect to Kafka between Cloudera & Confluent?
Is it just the Kafka versions and products surrounding the open source Kafka implementation, or is there something more that each vendor offers?
08-04-2016 06:21 PM - edited 08-04-2016 06:30 PM
At the lowest common denominator, Cloudera and Confluent both offer support for open source Apache Kafka. Cloudera's CDH is typically a base Apache release plus a few patches that their customers have requested. Their current release includes Apache Kafka 0.9.0 plus patches. Confluent provides similar packaging, but their current release is based on Apache Kafka 0.10.0. Given that Confluent's main role is to support Kafka, they support a little more of the Kafka ecosystem at the moment: for example, the Schema Registry, a REST proxy, and non-Java clients such as C and Python. Confluent also supports Kafka Connect and Kafka Streams. Correct me if I'm wrong, but Cloudera only supports the Kafka broker and uses Kafka with Hadoop components like Flume and Spark.
Disclaimer: I'm a Confluent employee and former Cloudera employee.
08-05-2016 02:22 AM
08-05-2016 08:35 AM
One quick clarification: Kafka 0.9+ has support for SSL. Spark 1.x uses the old consumer API, which only supports the PLAINTEXT protocol. Spark 2.0 uses the new consumer API, which does have SSL and Kerberos support.
CDH will most likely ship an updated Spark version once Cloudera has finished testing.
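To make the SSL point concrete, here is a rough sketch of the client-side settings the new consumer API reads when connecting over SSL. The property keys are the standard Kafka client configuration names; the helper function, host name, paths, and password are made up for illustration, and the Kerberos branch assumes a JAAS configuration is supplied to the JVM separately.

```python
def ssl_consumer_properties(bootstrap_servers, group_id,
                            truststore_path, truststore_password,
                            kerberos_service_name=None):
    """Render consumer.properties text for the Java (new) consumer API.

    Hypothetical helper: only the property keys come from Kafka itself.
    """
    settings = {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # SSL only, or SASL_SSL when Kerberos authentication is layered on top.
        "security.protocol": "SASL_SSL" if kerberos_service_name else "SSL",
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }
    if kerberos_service_name:
        # Must match the service name of the broker's Kerberos principal.
        settings["sasl.kerberos.service.name"] = kerberos_service_name
    return "".join(f"{key}={value}\n" for key, value in settings.items())


if __name__ == "__main__":
    # Placeholder values; substitute your own broker address and truststore.
    print(ssl_consumer_properties(
        "broker1.example.com:9093", "demo-group",
        "/etc/kafka/client.truststore.jks", "changeit"))
```

The old consumer API used by Spark 1.x has no equivalent of `security.protocol`, which is why it is limited to PLAINTEXT listeners.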
Navigator Encrypt is a good option in the Cloudera ecosystem for encryption at rest. Confluent provides this functionality through partnerships with a few encryption-at-rest vendors.
Most of the patches that Confluent provides originate from customer requests.
08-05-2016 09:24 PM