
Hortonworks DataFlow (HDF) includes Apache NiFi, whose wealth of processors makes it easy to ingest syslog messages from multiple servers. Information collected from syslog can be stored on HDFS as well as forwarded to other systems such as Splunk. Furthermore, you can parse the stream and select which information should be stored on HDFS and which should be routed to an indexer on Splunk.
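To make the idea of selective routing concrete, here is a minimal Python sketch of the kind of decision involved. It is not how NiFi implements routing (inside a flow this would more likely be done with a routing processor such as RouteOnAttribute); it only decodes the syslog priority value and applies a hypothetical policy. The function names and the rule "warning or worse goes to Splunk, everything else to HDFS" are assumptions made for illustration.

```python
import re

# A syslog line starts with <PRI>, where PRI = facility * 8 + severity.
_PRI = re.compile(r"^<(\d{1,3})>")


def decode_priority(line):
    """Return (facility, severity) for a syslog line, or None if it has no valid PRI."""
    match = _PRI.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23, severity 7
        return None
    return divmod(pri, 8)


def choose_destination(line, splunk_max_severity=4):
    """Hypothetical policy: severity 0-4 (warning or worse) to Splunk, the rest to HDFS."""
    decoded = decode_priority(line)
    if decoded is None:
        return "hdfs"  # assumption: keep unparseable lines in bulk storage
    _, severity = decoded
    return "splunk" if severity <= splunk_max_severity else "hdfs"


if __name__ == "__main__":
    for sample in ("<11>Sep 13 17:40:53 host1 sshd: error", "<14>Sep 13 17:40:53 host1 cron: info"):
        print(choose_destination(sample), "<-", sample)
```

Lower severity numbers are more urgent in syslog, which is why the comparison uses `<=`.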

To demonstrate this capability, let us first review the NiFi ListenSyslog processor:

38625-screen-shot-2017-09-13-at-54053-pm.png

The processor above corresponds to the syslog configuration in /etc/rsyslog.conf, which includes the following line. It forwards all messages to port 7780 on the local host (in rsyslog, a single @ selects UDP):

...

*.* @127.0.0.1:7780
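To check the listener independently of rsyslog, a small test sender can help. The following Python sketch builds one BSD-style (RFC 3164) syslog line and sends it as a UDP datagram to 127.0.0.1:7780, the address from the rsyslog line above. It assumes ListenSyslog is listening on UDP at that port; the function names, the `nifi-test` tag, and the message text are hypothetical.

```python
import socket
from datetime import datetime

# Fixed English month names, so the timestamp does not depend on the locale.
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def build_syslog_message(facility, severity, hostname, tag, text, when):
    """Build a line of the form <PRI>Mmm dd hh:mm:ss HOST TAG: TEXT."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0-23 and severity 0-7")
    pri = facility * 8 + severity  # PRI packs facility and severity into one number
    # The day of month is space-padded to two characters in this format.
    stamp = f"{_MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{stamp} {hostname} {tag}: {text}".encode("utf-8")


def send_test_message(host="127.0.0.1", port=7780):
    """Send one datagram to the listener and return the number of bytes sent."""
    payload = build_syslog_message(
        1, 6, socket.gethostname(), "nifi-test",
        "test message for ListenSyslog", datetime.now(),
    )
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


if __name__ == "__main__":
    print(send_test_message(), "bytes sent")
```

Because UDP is connectionless, a successful send does not prove the message was received; the processor's statistics in the NiFi UI are the place to confirm that it arrived.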

With this rsyslog rule in place, syslog messages are streamed into the NiFi flow, where we can direct them to another processor, PutSplunk, configured as follows:

38627-nifi-putsplunk.png

In the Splunk UI, you can configure data inputs under Settings > Data inputs > TCP ("Listen on a TCP port for incoming data, e.g. syslog"):

38628-splunk-data-inputs.png

To complete the selection, use the same port that we configured in the NiFi PutSplunk processor above (516):

38629-splunk-516.png

In the next step, configure linux_syslog as follows:

38630-splunk-syslog-linux.png
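Before starting the flow, it can be useful to confirm that the new TCP input accepts data. The Python sketch below opens a TCP connection to port 516 and sends newline-delimited events. It assumes Splunk is reachable on the local host; the host value, the function names, and the sample event are hypothetical and should be adjusted to your environment.

```python
import socket


def frame_events(events):
    """Join events into one payload with exactly one trailing newline per event."""
    return "".join(event.rstrip("\n") + "\n" for event in events).encode("utf-8")


def send_events(events, host="127.0.0.1", port=516, timeout=5.0):
    """Open a TCP connection, send all events, and return the number of bytes sent."""
    payload = frame_events(events)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)


if __name__ == "__main__":
    sample = "<14>Sep 13 17:40:53 localhost nifi-test: connectivity check"
    try:
        print("sent", send_events([sample]), "bytes")
    except OSError as exc:
        # Typical causes: the input is not enabled yet, or host/port are wrong.
        print("could not reach the TCP input:", exc)
```

If the connection is refused, check that the data input was saved and enabled on the expected port before looking at the NiFi side.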

At this point you can start the flow, and NiFi will ingest Linux syslog messages into Splunk.

Once data is received, you can search it in Splunk as follows:

38631-splunk-search-syslog.png

To retrieve information from Splunk, you can use the GetSplunk processor and connect it to a PutFile or PutHDFS processor. As an example, I configured GetSplunk as follows:

38632-nifi-getsplunk.png

For more details on HDF:

https://hortonworks.com/products/data-center/hdf/

Last update: 08-17-2019 11:12 AM