Cloudera Employee

In the previous article, we saw how to stream tweets using NiFi, Kafka, Tranquility, Druid and Superset ...

https://community.hortonworks.com/articles/177561/streaming-tweets-with-nifi-kafka-tranquility-druid...

To follow along here, you first need to set up the Druid datasource and the NiFi flow from that previous article.

But life is already hard enough, so why not simplify it?

The idea here is to perform the same streaming, but now integrating NiFi directly with Druid.

So, our new diagram would look like this:

67479-1.png

As we saw in that article, Tranquility acts as the integration layer between Kafka and Druid. Some people asked me: why not use the Kafka Indexing Service instead of Tranquility?

My answer: because Tranquility, as a framework, is flexible and can integrate almost any component with Druid.

Thus, for this NiFi-Druid integration, we will build a custom NiFi processor that uses Tranquility to send data directly into Druid.


OK, it's time to get hands-on!

Let's divide this work into 3 parts:

  1. Build druid processor
  2. Deploy it
  3. Set it up on Nifi

1. Build druid processor

Building the processor requires Maven and Java. Here is how you can quickly check whether they are installed:

$ mvn -version

67475-2.png

$ java -version

67476-3.png

If they are not installed, see:

https://maven.apache.org/install.html

https://www.java.com/
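
The two checks above can also be run as one small script. This is only a minimal sketch assuming a POSIX shell; the helper name check_tool is made up for illustration and is not part of Maven or Java.

```shell
#!/bin/sh
# Report whether a command is available on PATH.
# check_tool is a hypothetical helper name used only in this sketch.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check both build prerequisites and point at the install pages above if needed.
for tool in mvn java; do
  check_tool "$tool" || echo "Please install $tool before building the processor."
done
```

If both lines report "found", you can move on to the build step.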

  • 1.3 – Create nar file for your processor
cd <Home Dir>/nifi-druid-integration/fieldeng-nifi-druid-integration-master

mvn install

67477-4.png

Once mvn install finishes, you will find the nar file in the target directory, named nifi-druid-bundle-nar-0.0.1-SNAPSHOT.nar:

cd <Home Dir>/nifi-druid-integration/fieldeng-nifi-druid-integration-master/nifi-druid-bundle-nar/target

ls

nifi-druid-bundle-nar-0.0.1-SNAPSHOT.nar
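
Before deploying, it can be worth confirming that the build really produced a non-empty nar file. The following is a sketch, assuming it is run from the project root (the fieldeng-nifi-druid-integration-master directory); verify_nar is a hypothetical helper name, not something shipped with NiFi or Maven.

```shell
#!/bin/sh
# Succeed only if the given file exists and is not empty.
# verify_nar is a hypothetical helper name used only in this sketch.
verify_nar() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Assumption: the path is relative to the project root; pass another path as $1 to override.
NAR="${1:-nifi-druid-bundle-nar/target/nifi-druid-bundle-nar-0.0.1-SNAPSHOT.nar}"
verify_nar "$NAR" || echo "Run 'mvn install' first, then check again."
```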

2. Deploy it - It is a cinch.

Copy your nifi-druid-bundle-nar-0.0.1-SNAPSHOT.nar file to the NiFi lib directory. You can use something like this:

sudo scp -i yourkeyfile.pem \
/Users/tsantiago/Desktop/fieldeng-nifi-druid-integration-master/nifi-druid-bundle-nar/target/nifi-druid-bundle-nar-0.0.1-SNAPSHOT.nar \
centos@thiago-6.field.hortonworks.com:/usr/hdf/current/nifi/lib/

Restart NiFi, and that's it!

3. Set it up on Nifi

After restarting NiFi, you will see a new processor:

67478-5.png

Replace the last step of that flow (PutKafka) with PutDruidProcessor:

67480-6.png

And then configure it:

67481-7.png

You must fill in these properties on the Controller Service:

67482-8.png

67483-9.png

data_source: twitter_demo

zk_connect_string: thiago-2.field.hortonworks.com:2181,thiago-3.field.hortonworks.com:2181,thiago-4.field.hortonworks.com:2181

dimensions_list: tweet_id,created_unixtime,created_time,lang,location,displayname,time_zone,msg

aggregators_descriptor:

[  
   {  
      "type":"count",
      "name":"count"
   },
   {  
      "name":"value_sum",
      "type":"doubleSum",
      "fieldName":"value"
   },
   {  
      "fieldName":"value",
      "name":"value_min",
      "type":"doubleMin"
   },
   {  
      "type":"doubleMax",
      "name":"value_max",
      "fieldName":"value"
   }
]
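
Since Tranquility relies on ZooKeeper, the controller service presumably needs to reach every host listed in zk_connect_string. A quick connectivity test from the NiFi host can save some debugging. This is a rough sketch: zk_hosts and check_zk are hypothetical helper names, and it assumes a netcat (nc) build that supports the -z and -w options, which not every variant does.

```shell
#!/bin/sh
# Print one "host port" pair per line from a comma-separated
# ZooKeeper connect string such as "zk1:2181,zk2:2181".
# Entries without a port fall back to 2181, ZooKeeper's default client port.
zk_hosts() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS=: read -r host port; do
    echo "$host ${port:-2181}"
  done
}

# Try a TCP connection to each ZooKeeper host and report the result.
# Assumption: nc is installed and accepts -z (connect only) and -w (timeout in seconds).
check_zk() {
  zk_hosts "$1" | while read -r host port; do
    if nc -z -w 3 "$host" "$port" 2>/dev/null; then
      echo "reachable: $host:$port"
    else
      echo "NOT reachable: $host:$port"
    fi
  done
}
```

For example, calling check_zk with the zk_connect_string value shown above prints one line per ZooKeeper host. A "NOT reachable" line only means the TCP connection failed (or nc is unavailable); it says nothing about the Druid side.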

Finally, start this new flow and watch your Superset dashboards being populated.

Conclusion:

Now you know not only how to build a custom NiFi processor, but also how to integrate NiFi with Druid.

It's important to note that by sending data from NiFi straight to Druid, we can lose some scalability and application resilience: with a high volume of tweets, the integration between NiFi and Druid can become a bottleneck.

However, if your workload is not heavy, keeping it simple may be the best option.

References:

https://community.hortonworks.com/content/kbentry/177561/streaming-tweets-with-nifi-kafka-tranquilit...

https://community.hortonworks.com/articles/4318/build-custom-nifi-processor.html

https://github.com/hortonworks/fieldeng-nifi-druid-integration


Comments
Not applicable

Hi, the DruidTranquilityController gets stuck at enabling and never gets enabled, could you please let me know how to resolve that??

New Contributor

can you provide the XML of this flow?


Version history: revision 2 of 2, last updated 08-17-2019 08:09 AM