Posts: 90
Registered: ‎11-12-2015
Accepted Solution

Kudu and Elasticsearch

Hello, I'm moving data from Elasticsearch (ES) to Kudu using this pipeline:


ES => Logstash => Kafka => Flume => Spark Streaming => Kudu



Is there a more direct way to do this?



Cloudera Employee
Posts: 65
Registered: ‎09-28-2015

Re: Kudu and Elasticsearch

Hi Joaquin,

I'm not very familiar with Elasticsearch or Logstash, since they aren't
part of Cloudera's platform. But once the data is in Kafka, you should be
able to run Spark Streaming directly against Kafka without Flume in the
middle, as far as I understand. Or, if you are already using Flume, there
is some initial work on a Flume sink that writes directly to Kudu, though
that work is still in progress.
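
As a rough illustration of the first option (Spark reading from Kafka and writing to Kudu with no Flume hop), here is a minimal PySpark sketch. Everything named in it is an assumption rather than something from this thread: the broker address, topic, Kudu master, table name, and the event fields (id, @timestamp, message) are hypothetical, the Kudu table is assumed to exist already, and the job is assumed to run with the Spark Kafka source and the kudu-spark connector on the classpath, on a Spark version that supports foreachBatch.

```python
import json

# Hypothetical settings: none of these values come from the thread.
KAFKA_BROKERS = "broker1:9092"
KAFKA_TOPIC = "logstash-events"
KUDU_MASTER = "kudu-master:7051"
KUDU_TABLE = "events"


def parse_event(raw):
    """Turn one Kafka message value (assumed to be Logstash JSON) into a flat row.

    Returns None when the message is not a JSON object or has no "id" field,
    so that bad records can be filtered out before the Kudu write.
    """
    try:
        doc = json.loads(raw)
    except (TypeError, ValueError):
        return None
    if not isinstance(doc, dict) or "id" not in doc:
        return None
    return {
        "id": str(doc["id"]),
        "ts": doc.get("@timestamp"),
        "message": doc.get("message"),
    }


def run():
    # Imported here so parse_event stays usable without Spark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import MapType, StringType

    spark = SparkSession.builder.appName("kafka-to-kudu").getOrCreate()
    to_row = udf(parse_event, MapType(StringType(), StringType()))

    # Read the topic directly, parse each value, and drop unparseable records.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", KAFKA_BROKERS)
        .option("subscribe", KAFKA_TOPIC)
        .load()
        .select(to_row(col("value").cast("string")).alias("row"))
        .where(col("row").isNotNull())
        .select(
            col("row")["id"].alias("id"),
            col("row")["ts"].alias("ts"),
            col("row")["message"].alias("message"),
        )
    )

    def write_batch(batch_df, batch_id):
        # Write each micro-batch through the kudu-spark data source.
        (
            batch_df.write.format("org.apache.kudu.spark.kudu")
            .option("kudu.master", KUDU_MASTER)
            .option("kudu.table", KUDU_TABLE)
            .mode("append")
            .save()
        )

    events.writeStream.foreachBatch(write_batch).start().awaitTermination()
```

The script would call run() when submitted with spark-submit. parse_event is kept as a plain function so the record handling can be checked on its own, without a cluster.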

If Logstash is pluggable, maybe you could write a plugin that uses the Kudu
Java API to write directly to Kudu? If you do end up doing this, please put
it on GitHub so we can promote and share your work!


Posts: 90
Registered: ‎11-12-2015

Re: Kudu and Elasticsearch

Thanks Todd,


I can't take out Flume because I use it to split the data, and I don't have the skills or the time to create a plugin. I will keep my current setup until there is an official connector.