Part 1 - In case you missed the Introduction to Apache NiFi

Assumption: HDP and NiFi are already installed. You can use the HDP Sandbox if you don't have a cluster. For the NiFi installation, you can follow Blog 1.

The end goal of this tutorial is to display tweets related to particular search terms. For example, my Twitter ID is allaboutbdata, and the following screenshot shows a tweet sent on Twitter and the same tweet in HDFS/Hive and Solr. The whole setup was done using NiFi.



Install HDP Search:

yum install -y lucidworks-hdpsearch

Create the Solr user directory in HDFS and change its ownership:

sudo -u hdfs hadoop fs -mkdir /user/solr
sudo -u hdfs hadoop fs -chown solr /user/solr
chown -R solr:solr /opt/lucidworks-hdpsearch/solr

Set up Solr:

su solr
cd /opt/lucidworks-hdpsearch/solr/server/solr-webapp/webapp/banana/app/dashboards/
mv default.json default.json.orig
wget

Important: if you are not using the HDP sandbox, you must change the hostname at line number 740.

Add <str>EEE MMM d HH:mm:ss Z yyyy</str> to solrconfig.xml so that Solr can parse the tweet timestamps:

vi /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml
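If you would rather script the edit than make it in vi, the idea can be sketched as follows. This is only an illustration under stated assumptions: it works on a throwaway file that is a simplified stand-in for the relevant part of solrconfig.xml, and it assumes the new line belongs inside the arr name="format" list of solr.ParseDateFieldUpdateProcessorFactory, which is where the stock data_driven_schema_configs usually lists its date formats. Check your own file before applying anything similar to it.

```shell
# Work on a throwaway copy. The fragment below is a simplified stand-in
# (an assumption), not the full solrconfig.xml shipped with HDP Search.
WORKDIR="$(mktemp -d)"
SRC="${WORKDIR}/solrconfig-fragment.xml"
OUT="${WORKDIR}/solrconfig-fragment.patched.xml"

cat > "${SRC}" <<'EOF'
<processor class="solr.ParseDateFieldUpdateProcessorFactory">
  <arr name="format">
    <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
    <str>yyyy-MM-dd</str>
  </arr>
</processor>
EOF

# The date format from the article, indented to match the fragment.
TWITTER_FORMAT='    <str>EEE MMM d HH:mm:ss Z yyyy</str>'

# Print every line; right after the first <arr name="format"> line,
# also print the new format line (only once).
awk -v line="${TWITTER_FORMAT}" '
  { print }
  /<arr name="format">/ && !done { print line; done = 1 }
' "${SRC}" > "${OUT}"

cat "${OUT}"
```

Writing to a separate output file keeps the original untouched, so you can compare the two with diff before replacing anything.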

Start Solr in cloud mode and create a collection called tweets:

export JAVA_HOME=/usr/jdk64/jdk1.8.0_60/jre/
/opt/lucidworks-hdpsearch/solr/bin/solr start -c -z localhost:2181
/opt/lucidworks-hdpsearch/solr/bin/solr create -c tweets -d data_driven_schema_configs -s 1 -rf 1
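Once the collection exists, a quick sanity check is to ask Solr how many documents it holds. The sketch below only builds the query URL and defines a small helper; it assumes Solr is listening on its default port 8983 on localhost, and count_tweets is a hypothetical helper name used here for illustration.

```shell
# Assumptions: Solr runs on localhost and its default port 8983.
# Override SOLR_HOST / SOLR_PORT if your setup differs.
SOLR_HOST="${SOLR_HOST:-localhost}"
SOLR_PORT="${SOLR_PORT:-8983}"
COLLECTION="tweets"

# rows=0 asks for the response header and numFound only, not the documents.
QUERY_URL="http://${SOLR_HOST}:${SOLR_PORT}/solr/${COLLECTION}/select?q=*:*&rows=0&wt=json"

# Hypothetical helper: a thin wrapper around a plain curl call.
count_tweets() {
  curl -s "${QUERY_URL}"
}

echo "To check the collection, run: count_tweets"
echo "It will query: ${QUERY_URL}"
```

Running count_tweets on the Solr host should return a JSON response whose numFound field is 0 before the flow starts and grows as tweets are indexed.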

Download the Twitter NiFi template from here

Import the template by clicking the 3rd icon from the left, as shown below.

Browse to the XML file that you downloaded and import it. Click the X at the top right to close the popup. Now, let's load the template.

Click the 7th icon from the left and drag it onto the canvas. Now, let's configure the Twitter template.

Set up a Twitter developer account and create an app. Once that is done, you need the following information for the GetTwitter processor.

Start the flow.


Happy Hadoooping!!


Hi Neeraj,

I'm trying to follow this demo, but my dashboard is empty. Also, the PutSolrContentStream processor shows zero records written to output although 64 records were input. How do I debug what is stopping the records from being written to Solr?