Master Guru

Monitoring Apache NiFi

14305-reportingtasktoganglia.png

14304-reportingtasks.png

It's important to set up some Reporting Tasks so you know what's happening on your Apache NiFi servers. The Ambari Reporting Task sends metrics to your HDF Ambari, which shows the results in Grafana graphs, charts and tables.

You can also monitor disk usage and memory, and send metrics to DataDog, Ganglia and other servers. It's also easy to write your own Reporting Task if you need a different one.

14306-monitoringflowspluschatops.png

14303-monitoractivitytoslackops.png

One way to monitor your Apache NiFi data flows is to use the MonitorActivity processor, which creates messages that can be sent to your operations dashboard, console or elsewhere.

For people doing ChatOps, you can easily push these messages to Slack with the PutSlack processor. You could also send a REST call to HipChat or other chat tools. It's pretty easy to wrap that up in a custom processor as well.
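If you want to see what such a REST call looks like outside of NiFi, here is a minimal Python sketch. It assumes a Slack-style incoming webhook that accepts a JSON body with a `text` field; the function names and the webhook URL in the comment are hypothetical, and other chat tools will likely expect a different payload.

```python
import json
import urllib.request


def build_chat_payload(message: str) -> bytes:
    """Encode a message as a JSON body with a single "text" field
    (the shape a Slack incoming webhook accepts)."""
    return json.dumps({"text": message}).encode("utf-8")


def post_to_chat(webhook_url: str, message: str, timeout: float = 10.0) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_chat_payload(message),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (hypothetical webhook URL, replace with your own):
# post_to_chat("https://hooks.slack.com/services/XXX/YYY/ZZZ",
#              "MonitorActivity: no data seen on the ingest flow")
```

Inside a flow, PutSlack does this job for you; a script like this is mainly handy for testing a webhook before wiring it into NiFi.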

Other Things to Monitor

REST Endpoints

server:port/nifi-api/system-diagnostics

See: https://nifi.apache.org/docs/nifi-docs/rest-api/
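As a rough sketch, you could poll that endpoint from a small Python script and pull out a few values. This assumes an unsecured NiFi instance reachable over plain HTTP, and the JSON field names used here (`systemDiagnostics`, `aggregateSnapshot`, `heapUtilization`, `totalThreads`, `contentRepositoryStorageUsage`) are my assumption about the response shape; check them against the REST API docs for your version. The function names and the example URL are hypothetical.

```python
import json
import urllib.request


def fetch_system_diagnostics(base_url: str, timeout: float = 10.0) -> dict:
    """GET <base_url>/nifi-api/system-diagnostics and return the parsed JSON."""
    url = base_url.rstrip("/") + "/nifi-api/system-diagnostics"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def summarize(diagnostics: dict) -> dict:
    """Pick a few values out of the response.

    Missing fields come back as None or an empty list instead of raising,
    since the field names are assumptions.
    """
    snapshot = diagnostics.get("systemDiagnostics", {}).get("aggregateSnapshot", {})
    return {
        "heap_utilization": snapshot.get("heapUtilization"),
        "total_threads": snapshot.get("totalThreads"),
        "content_repo_utilization": [
            repo.get("utilization")
            for repo in snapshot.get("contentRepositoryStorageUsage", [])
        ],
    }


# Example call (hypothetical host and port):
# print(summarize(fetch_system_diagnostics("http://server:8080")))
```

A secured cluster would also need authentication on the request, which this sketch leaves out.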

Logs

...nifi/logs/nifi-app.log and

...nifi/logs/nifi-user.log

These can be ingested with Apache NiFi for detailed log processing.

You can filter and send some messages to SumoLogic or elsewhere via Apache NiFi.

See: https://community.hortonworks.com/content/kbentry/67309/routing-logs-through-apache-nifi-to-phoenix-...
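To give a feel for the filtering step, here is a small Python sketch that keeps only WARN and ERROR lines from a NiFi log. It assumes each entry starts with a timestamp, a level and a bracketed thread name (for example `2017-04-05 10:00:01,456 ERROR [thread] ...`), which depends on your logback configuration; the regular expression and function name are hypothetical and would need adjusting if your pattern differs.

```python
import re
from typing import Iterable, Iterator

# Assumed line layout: "<date> <time>,<millis> <LEVEL> [<thread>] <rest>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<thread>[^\]]*)\] (?P<rest>.*)$"
)


def filter_by_level(
    lines: Iterable[str], levels=("WARN", "ERROR")
) -> Iterator[dict]:
    """Yield parsed entries whose level is in `levels`.

    Lines that do not match the assumed layout (stack trace
    continuation lines, for example) are skipped.
    """
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            yield match.groupdict()


# Example call (path is elided here just as in the article):
# with open(".../logs/nifi-app.log") as handle:
#     for entry in filter_by_level(handle):
#         print(entry["date"], entry["level"], entry["rest"])
```

In a flow you would do the same kind of routing with NiFi processors; the script just shows the idea of matching on the level before forwarding anything.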

Comments
Rising Star

PutSlack was such a good addition!

Be careful ingesting nifi-app.log though! I've tried this before and it quickly spirals out of control as each read of the log also generates log entries which then get picked up and generate more log entries.

Expert Contributor

@Sebastian Carroll IMHO you would need a separate NiFi instance for that purpose; the same goes if you want to archive Provenance events from the NiFi instance. Another option would be to send the logs to Splunk, etc. for log processing and for any analytics on top of that (dashboards, alerts, etc.).

Master Guru

Yes, much safer to have another instance you can use for reporting and such. Even if it's just one node.

New Contributor
As a general best practice, I suggest sending those metrics to an altogether separate monitoring system (something like InfluxDB). You can’t effectively monitor a thingy with the same thingy. If that thingy fails… you risk losing visibility. #JustSayin