Community Articles

GrazittiAPI · 09-13-2017

Hortonworks DataFlow (HDF) includes Apache NiFi, whose wealth of processors makes it easy to ingest syslog messages from multiple servers. Information collected from syslog can be stored in HDFS as well as forwarded to other systems such as Splunk. Furthermore, you can parse the stream and select which information should be stored in HDFS and which should be routed to an indexer on Splunk.
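The routing decision described here can be sketched outside NiFi to make the idea concrete. The snippet below is only an illustration in Python: it reads the leading `<PRI>` field of a syslog line and picks a destination. The names `parse_priority` and `choose_destination`, and the rule "severity 4 (warning) or more urgent goes to Splunk, everything else to HDFS", are assumptions made for this example and are not part of the original flow. In an actual NiFi flow the same decision would normally be made by processors such as ParseSyslog and RouteOnAttribute rather than by custom code.

```python
import re

# Syslog lines start with a priority field such as "<14>".
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


def parse_priority(line):
    """Return (facility, severity) from the leading <PRI> field, or None."""
    match = PRI_PATTERN.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23 * 8 + severity 7
        return None
    # PRI = facility * 8 + severity
    return divmod(pri, 8)


def choose_destination(line, splunk_max_severity=4):
    """Pick "splunk" or "hdfs" for one line (hypothetical routing rule).

    Lower severity numbers are more urgent (0 = emergency, 7 = debug).
    Lines without a valid priority fall back to "hdfs".
    """
    parsed = parse_priority(line)
    if parsed is None:
        return "hdfs"
    _facility, severity = parsed
    return "splunk" if severity <= splunk_max_severity else "hdfs"
```

With this rule, an error line such as `<11>... disk failure` would be sent to Splunk, while an informational line such as `<14>... session opened` would only be archived in HDFS.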

To demonstrate this capability, let us first review the NiFi ListenSyslog processor:

The above processor corresponds to the syslog configuration in /etc/rsyslog.conf which includes the following line:

...

*.* @127.0.0.1:7780
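A single `@` in an rsyslog forwarding rule means UDP, so one way to check that the listener is reachable is to send a hand-built test message to the same address. The following is a minimal sketch, assuming ListenSyslog is listening for UDP on port 7780 of the local host; the helper names `build_syslog_message` and `send_udp`, the tag `nifi-test`, and the message text are made up for this example.

```python
import socket
from datetime import datetime

# Month abbreviations are hard-coded so the output does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_message(facility, severity, hostname, tag, text, now=None):
    """Build a BSD-style (RFC 3164) syslog line: <PRI>Mmm dd hh:mm:ss host tag: text."""
    now = now or datetime.now()
    pri = facility * 8 + severity
    # RFC 3164 pads a single-digit day with a space, e.g. "Sep  3".
    timestamp = f"{MONTHS[now.month - 1]} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}"


def send_udp(message, host="127.0.0.1", port=7780):
    """Send one syslog line as a single UDP datagram.

    Host and port default to the values in the rsyslog rule; that
    ListenSyslog accepts UDP there is an assumption of this sketch.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


# Example call (facility 1 = user, severity 6 = informational):
# send_udp(build_syslog_message(1, 6, socket.gethostname(), "nifi-test", "hello"))
```

If the listener is running, the test message should appear as a flow file on the outgoing connection of ListenSyslog. Because UDP is connectionless, the call does not fail when nothing is listening, so the check has to be made on the NiFi side.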

The rsyslog rule above causes syslog messages to be streamed into the NiFi flow, which we can direct to another processor, PutSplunk, configured as follows:

In the Splunk UI you can configure data inputs under Settings -> Data inputs -> TCP (Listen on a TCP port for incoming data, e.g. syslog):

To complete the selection, use the port corresponding to the one we configured in the above NiFi PutSplunk processor (516).

In the next step, configure linux_syslog as follows:

At this point you can start the flow and NiFi will ingest Linux syslog messages into Splunk.
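If nothing arrives in Splunk, it can help to test the TCP data input on its own, bypassing NiFi. The sketch below sends a few newline-terminated lines straight to the input. It assumes Splunk runs on the local host and listens on port 516 as configured above; the function name `send_tcp_lines` is introduced for this example only.

```python
import socket


def send_tcp_lines(lines, host="127.0.0.1", port=516, timeout=5.0):
    """Send newline-terminated lines over one TCP connection.

    The defaults assume a Splunk TCP data input on localhost:516
    (the host is an assumption; adjust it to your Splunk server).
    Returns the number of bytes sent.
    """
    # Make sure every line ends with exactly one newline.
    payload = "".join(line.rstrip("\n") + "\n" for line in lines).encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
    return len(payload)


# Example call:
# send_tcp_lines(["<14>Sep 13 08:05:09 host1 nifi-test: direct test line"])
```

A refused connection points to the Splunk data input (wrong port or input not enabled), whereas a successful send with no data from NiFi points to the PutSplunk configuration.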

Once data is received you can search it in Splunk as follows:

To retrieve information from Splunk you can use the GetSplunk processor and connect it to a PutFile or PutHDFS processor. As an example, I have used GetSplunk as follows:

For more details on HDF:

https://hortonworks.com/products/data-center/hdf/

Nifi Splunk syslog integration