Cloudera Employee

Hortonworks DataFlow (HDF) includes Apache NiFi, whose wealth of processors makes it easy to ingest syslog messages from multiple servers. Information collected from syslog can be stored on HDFS as well as forwarded to other systems such as Splunk. Furthermore, you can parse the stream and select which information should be stored on HDFS and which should be routed to an indexer on Splunk.
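To make the idea of parsing the stream and choosing a destination concrete, here is a minimal Python sketch of the decision logic only. It is not NiFi code: in a NiFi flow this kind of choice would typically be made by a routing processor such as RouteOnAttribute. The function names and the rule itself (warning and more severe goes to Splunk, the rest to HDFS) are assumptions made up for illustration.

```python
import re

# A syslog line starts with a priority value in angle brackets, e.g. "<34>".
# priority = facility * 8 + severity
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def parse_priority(message):
    """Return (facility, severity), or None if the line has no <PRI> prefix."""
    match = PRI_PATTERN.match(message)
    if match is None:
        return None
    return divmod(int(match.group(1)), 8)


def choose_destination(message, splunk_max_severity=4):
    """Hypothetical rule: severities 0-4 (emergency..warning) go to Splunk,
    everything else, including unparseable lines, is archived on HDFS."""
    parsed = parse_priority(message)
    if parsed is None:
        return "hdfs"
    _facility, severity = parsed
    return "splunk" if severity <= splunk_max_severity else "hdfs"
```

With this rule, a line starting with `<34>` (facility 4, severity 2) would be sent to Splunk, while a line starting with `<14>` (facility 1, severity 6) would be kept on HDFS only.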

To demonstrate this capability, let us first review the NiFi ListenSyslog processor:


The above processor corresponds to the syslog configuration in /etc/rsyslog.conf which includes the following line:


*.* @

This causes syslog messages to be streamed into the NiFi flow, which we can direct to another processor, PutSplunk, configured as follows:


In the Splunk UI you can configure data inputs under Settings -> Data inputs -> TCP (listen on a TCP port for incoming data, e.g. syslog):


To complete the selection, use the port that matches the one configured in the NiFi PutSplunk processor above (516).


In the next step, configure linux_syslog as follows:


At this point you can start the flow, and NiFi will ingest Linux syslog messages into Splunk.
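A hand-sent test message can help confirm that the flow is running before you go looking for data. The following Python sketch builds an RFC 3164-style message and sends it over UDP, which is the transport a single @ in an rsyslog forwarding rule uses. The helper names, the default hostname and tag, and the host and port in the usage comment are placeholders rather than values from this article; substitute the address your ListenSyslog processor actually listens on, and check that it is set to UDP.

```python
import socket
import time


def build_syslog_message(text, facility=1, severity=6,
                         hostname="testhost", tag="nifi-test"):
    """Build an RFC 3164-style line: <PRI>Mmm dd hh:mm:ss host tag: text.

    Defaults are illustrative: facility 1 (user) and severity 6 (info)
    give priority 14.
    """
    priority = facility * 8 + severity
    now = time.localtime()
    # RFC 3164 pads single-digit days with a space rather than a zero.
    timestamp = "{} {:2d} {}".format(
        time.strftime("%b", now), now.tm_mday, time.strftime("%H:%M:%S", now)
    )
    return "<{}>{} {} {}: {}".format(priority, timestamp, hostname, tag, text)


def send_udp(message, host, port):
    """Send one syslog line as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


# Example usage (placeholder address, replace with your ListenSyslog host and port):
# send_udp(build_syslog_message("hello from the test script"), "127.0.0.1", 5140)
```

Because UDP is connectionless, a successful send does not prove delivery; the message showing up downstream is the real check.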

Once data is received you can search it in Splunk as follows:


To retrieve information from Splunk, you can use the GetSplunk processor and connect it to a PutFile or PutHDFS processor. As an example, I have used GetSplunk as follows:


For more details on HDF:

Version history
Last update:
‎08-17-2019 11:12 AM