
Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

Contributor

Hi,

I have developed a dataflow in NiFi that has a number of steps (fetch files, split, enrich, transform, and store in a database). I would like to get an email alert if there are any failures in the dataflow, and a nice email report at the end.

1. How to set up email alerts effectively: let's say that at step 4 of a dataflow there were 100K failures in 1 minute because a Geo lookup service was unavailable for 1 minute. In such a scenario, how can we set up email alerts so that we receive only one or a few alerts?

2. How can we generate email reports showing, for example, how many events were processed, the count of failures, etc.?

Please feel free to share any other good practices, thoughts, or ideas for monitoring dataflows in NiFi.

Thanks

5 REPLIES

Re: Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

New Contributor

You could send the FlowFiles from the error relationships to ReplaceText, MergeContent, and PutEmail processors to bundle up the errors into a single email every X minutes or X FlowFiles. Failed and successful FlowFiles could be sent to a separate set of ReplaceText, MergeContent, PutEmail, and/or ExecuteScript to create an email report at the end.
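To make the "single email every X minutes or X FlowFiles" idea concrete, here is a minimal Python sketch of the batching logic only. It is not NiFi code and not how MergeContent is implemented; the class name `AlertBatcher` and the parameters `max_entries` and `max_age_seconds` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlertBatcher:
    """Collects error messages and releases them as one digest.

    max_entries and max_age_seconds are hypothetical names that stand in
    for "every X FlowFiles" and "every X minutes".
    """
    max_entries: int = 1000
    max_age_seconds: float = 300.0
    _messages: List[str] = field(default_factory=list)
    _opened_at: Optional[float] = None

    def add(self, message: str, now: float) -> Optional[str]:
        """Add one error; return a digest body when the batch is full or old."""
        if self._opened_at is None:
            self._opened_at = now  # first message opens a new batch
        self._messages.append(message)
        full = len(self._messages) >= self.max_entries
        old = now - self._opened_at >= self.max_age_seconds
        return self.flush() if (full or old) else None

    def flush(self) -> Optional[str]:
        """Emit whatever is pending as a single email body, then reset."""
        if not self._messages:
            return None
        body = f"{len(self._messages)} failures\n" + "\n".join(self._messages)
        self._messages = []
        self._opened_at = None
        return body
```

With this shape, 100K failures in one minute would produce a handful of digests rather than 100K emails. Note that the age check here only runs when a new message arrives, so a real setup would also need a timer to flush a batch that has gone quiet.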

The most flexible solution would be to route the success and failure information to a persistent store (e.g. RDBMS, Solr) and use your standard reporting tools (e.g. Tableau, Banana) to produce custom reports and dashboards, but of course, this would be more complex.

Re: Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

Contributor

Thanks for sharing your comments,

I am actually interested in a long-term solution, i.e. a general, scalable approach we could use for all dataflows.

I really like your idea to 'route the success and failure information to a persistent store'. I would really appreciate it if you could also specify what information to store/index for effective alerts (or what you might be using). One thing I am looking at is bulletins (Error, Warn, Info, etc.). Using REST, we could retrieve the bulletins and store/index them for alerts. However, I am open to better options.

Obaid

PS:

REST endpoint: https://:8080/nifi-api/controller/bulletin-board

<bulletinBoardEntity>
  <revision>
    <clientId>32142c7d-e476-438f-ac75-36643f4f7876</clientId>
  </revision>
  <bulletinBoard>
    <bulletins>
      <category>Log Message</category>
      <groupId>6e5d9f8f-5a46-43a7-a474-46a3de01e608</groupId>
      <id>89</id>
      <level>ERROR</level>
      <message>
        InvokeHTTP[id=a91c88e3-e2c2-471c-a32b-f7bd2fdda14c] Yielding processor due to exception encountered as a source processor: java.net.UnknownHostException: wronghostname.com: unknown error
      </message>
      <nodeAddress>host:8080</nodeAddress>
      <sourceId>a91c88e3-e2c2-471c-a32b-f7bd2fdda14c</sourceId>
      <sourceName>InvokeHTTP</sourceName>
      <timestamp>16:47:21 UTC</timestamp>
    </bulletins>
    <bulletins>
      <category>Log Message</category>
      <groupId>6e5d9f8f-5a46-43a7-a474-46a3de01e608</groupId>
      <id>90</id>
      <level>ERROR</level>
      <message>
        InvokeHTTP[id=a91c88e3-e2c2-471c-a32b-f7bd2fdda14c] Yielding processor due to exception encountered as a source processor: java.net.UnknownHostException: wronghostname.com
      </message>
      <nodeAddress>host:8080</nodeAddress>
      <sourceId>a91c88e3-e2c2-471c-a32b-f7bd2fdda14c</sourceId>
      <sourceName>InvokeHTTP</sourceName>
      <timestamp>16:47:27 UTC</timestamp>
    </bulletins>
    <bulletins>
      <category>Log Message</category>
      <groupId>6e5d9f8f-5a46-43a7-a474-46a3de01e608</groupId>
      <id>91</id>
      <level>ERROR</level>
      <message>
        InvokeHTTP[id=a91c88e3-e2c2-471c-a32b-f7bd2fdda14c] Yielding processor due to exception encountered as a source processor: java.net.UnknownHostException: wronghostname.com: unknown error
      </message>
      <nodeAddress>host:8080</nodeAddress>
      <sourceId>a91c88e3-e2c2-471c-a32b-f7bd2fdda14c</sourceId>
      <sourceName>InvokeHTTP</sourceName>
      <timestamp>16:47:33 UTC</timestamp>
    </bulletins>
    <generated>16:47:39 UTC</generated>
  </bulletinBoard>
</bulletinBoardEntity>
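
As a rough illustration of how a response like the one above could be turned into a few alerts instead of one per bulletin, here is a Python sketch that parses the XML and counts ERROR bulletins per source processor. The function names `parse_bulletins` and `summarize_errors` are hypothetical, and the element names are taken only from the sample above; the endpoint and response shape may differ between NiFi versions.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict, List


def parse_bulletins(xml_text: str) -> List[Dict[str, str]]:
    """Turn each <bulletins> element into a flat dict of its child tags."""
    root = ET.fromstring(xml_text)
    bulletins = []
    for node in root.iter("bulletins"):
        # Strip whitespace because <message> bodies span several lines.
        bulletins.append({child.tag: (child.text or "").strip() for child in node})
    return bulletins


def summarize_errors(bulletins: List[Dict[str, str]]) -> Counter:
    """Count ERROR bulletins per (sourceId, sourceName).

    Repeated failures from one processor collapse into a single entry,
    which is one possible basis for sending one alert per processor.
    """
    counts: Counter = Counter()
    for b in bulletins:
        if b.get("level") == "ERROR":
            counts[(b.get("sourceId", ""), b.get("sourceName", ""))] += 1
    return counts
```

Applied to the three bulletins above, this would yield a single entry for the InvokeHTTP processor with a count of 3, which could be stored or indexed in place of the raw messages.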

Re: Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

New Contributor

The first option would be scalable if you were to create a dedicated process group for alerts and route all status FlowFiles into it. It could be configured once and support all flows.

The second option is especially useful if you have other solutions that need monitoring and alerting. Many operations teams use Solr and Banana for monitoring, alerting, and log search across all production environments. NiFi can be used to feed in log data and metrics to provide a single source of truth and a unified dashboard for everything (e.g. NiFi, web applications, ETL, databases, etc.). MiNiFi can be used to extract log files.

The information you need to index would depend on your needs. NiFi has built-in provenance and flow analytics, so try not to reinvent the wheel. I would imagine failure/success, error code, error message, the last processor ID, FlowFileID, lineage duration, and some content-specific dimensions (e.g. source system) would be a good place to start. The FlowFileID could then be used for a provenance search within NiFi if you need to go deeper.
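
For illustration only, the fields listed above could be captured in a record like the following before being sent to the store. This is a Python sketch; the class name, field names, and types are assumptions, not a NiFi or Solr schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FlowStatusRecord:
    """One status event per FlowFile; all field names are hypothetical."""
    flowfile_id: str            # usable later for a provenance search
    succeeded: bool             # failure/success
    last_processor_id: str
    lineage_duration_ms: int
    source_system: str          # example of a content-specific dimension
    error_code: Optional[str] = None
    error_message: Optional[str] = None


def to_index_document(record: FlowStatusRecord) -> str:
    """Serialize the record as JSON, ready to send to an indexing store."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping the record flat like this should make it easy to aggregate by processor or source system in a dashboard, while the FlowFile ID remains the link back to NiFi provenance.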

Re: Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

Contributor

Thanks Sam Hjelmfelt for your response,

- The first option is the easiest to implement, but you would need to change each flow to enable alerting. Also, not all processors have failure relationships (e.g. ListSFTP), and we end up getting duplicate emails (link).

- It would be great if you could suggest an approach for retrieving these metrics: 'failure/success, error code, error message, the last processor ID, FlowFileID, lineage duration, and some content specific dimensions (e.g. source system)'. Do I need to do a provenance search through the API?

Thanks

Obaid

Re: Apache Nifi - How to productionize dataflows in Nifi (Alerts,Monitoring & reporting)

New Contributor

Can you share how this evolved for you? I have similar needs: I am trying to take NiFi into production and am looking for ways to monitor it, with alerting through a central monitoring tool if flows are not running as expected.