Description

Learn how to consume real-time data from the Satori RTM platform using NiFi.

Background

Satori is a cloud-based live data platform that provides a publish-subscribe messaging service called RTM. It also makes a set of free real-time data feeds available as part of its Open Data Channels initiative:

https://www.satori.com/docs/using-satori/overview
https://www.satori.com/opendata/channels

This article steps through how to consume from Satori's Open Data Channels in NiFi, using a custom NiFi processor.
Note: this article assumes you already have a working NiFi instance up and running.

Link to the code on GitHub: https://github.com/laurencedaluz/nifi-satori-bundle

Installing the custom processor

To create the required .nar file, clone the following repo and build it with Maven:

git clone https://github.com/laurencedaluz/nifi-satori-bundle.git
cd nifi-satori-bundle
mvn clean install

This produces the following .nar file under the nifi-satori-bundle-nar/target/ directory:

nifi-satori-bundle-nar-<version>.nar

Copy this file into the lib directory of your NiFi instance. On HDF, that directory is:

/usr/hdf/current/nifi/lib

Restart NiFi for the nar to be loaded.
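
The copy step above could be scripted roughly as follows. This is a minimal sketch, not something shipped with nifi-satori-bundle: the function name install_nar and its two arguments are hypothetical, and it only assumes the build output and lib directory described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nifi-satori-bundle): copy the built
# .nar from a Maven target directory into a NiFi lib directory.
install_nar() {
  local target_dir="$1"   # e.g. nifi-satori-bundle-nar/target
  local lib_dir="$2"      # e.g. /usr/hdf/current/nifi/lib
  local nar=""
  local f

  # Find the bundle .nar produced by the build (the version varies).
  for f in "$target_dir"/nifi-satori-bundle-nar-*.nar; do
    if [ -f "$f" ]; then
      nar="$f"
      break
    fi
  done

  if [ -z "$nar" ]; then
    echo "No nifi-satori-bundle-nar-*.nar found in $target_dir" >&2
    return 1
  fi
  if [ ! -d "$lib_dir" ]; then
    echo "NiFi lib directory not found: $lib_dir" >&2
    return 1
  fi

  cp "$nar" "$lib_dir"/ && echo "Installed $(basename "$nar") into $lib_dir"
}
```

It would be called as install_nar nifi-satori-bundle-nar/target /usr/hdf/current/nifi/lib, adjusting the second argument for a non-HDF install. NiFi still needs a restart afterwards; how you do that depends on how your instance is managed.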

Using the ConsumeSatoriRtm processor

The ConsumeSatoriRtm processor accepts the following configuration properties:

41573-satori-nifi-configurations.png

At a minimum, you need to provide the following properties (which you can get directly from the Satori Open Data Channels site):

  • Endpoint
  • Appkey
  • Channel

In this example, I've chosen to consume from the 'big-rss' feed using the configurations provided here: https://www.satori.com/opendata/channels/big-rss

That's it! After starting the ConsumeSatoriRtm processor, you will see data flowing:

41574-satori-nifi-to-kafka.png

Additional Features

  • The processor also supports using Satori's Streamview filters, which allow you to provide SQL-like queries to select, transform, or aggregate messages from a subscribed channel:
    • In the 'big-rss' example above, the following filter configuration would limit the stream to messages whose feedURL contains "jobs":
    select * from `big-rss` where feedURL like '%jobs%'
  • The NiFi processor also supports batching multiple messages into a single FlowFile, so that each file contains a newline-delimited list of messages (controlled by a 'minimum batch size' configuration):

    41575-satori-nifi-bigrss-feed.png
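
Because a batched FlowFile holds one message per line, anything downstream can split it back into individual messages by reading line by line. Here is a minimal sketch, assuming the batch has been written to a local file; the function name count_messages is hypothetical. The optional substring argument only loosely imitates the 'jobs' filter above: it matches anywhere in the message, not just in feedURL, and runs locally rather than on the Satori side.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the messages in a newline-delimited batch
# file, optionally only those containing a given substring.
count_messages() {
  local file="$1"
  local needle="${2:-}"   # optional substring; empty matches every message
  local count=0
  local line

  # One message per line; the '|| [ -n "$line" ]' part keeps a final
  # line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -z "$line" ]; then
      continue            # ignore blank lines
    fi
    case "$line" in
      *"$needle"*) count=$((count + 1)) ;;
    esac
  done < "$file"

  echo "$count"
}
```

For example, count_messages batch.txt prints the total number of messages in the file, and count_messages batch.txt jobs prints how many of them mention "jobs".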

Last update: 08-17-2019 10:27 AM