Apache NiFi Queue Monitoring and Alerting

Explorer

Hi everyone,

I have a requirement for Apache NiFi. If files in a NiFi flow are stuck in a queue for more than a specified time, say 5 minutes, I should get an email saying that a file is stuck in the flow.

Is that possible? Please let me know if there is a possible solution.

Thanks in advance !!

3 REPLIES

Expert Contributor

I believe we can determine this by analyzing the log. For instance, if a job is stuck in a queue, we can check the log for specific statuses such as "suspended" or "wait." If such statuses are found, we can trigger an email notification.

Shakib M.

New Contributor

Hello,

Hi everyone, I have a requirement for Apache NiFi. In a NiFi flow, if files are stuck in a queue for more than a specified time, I should receive an email notification indicating that a file is stuck in the NiFi flow.

Master Mentor

@udayAle 

Some NiFi processors process FlowFiles one at a time, and others may process batches of FlowFiles in a single thread execution. Then there are processors like MergeContent and MergeRecord that allocate FlowFiles to bins and merge a bin only once the minimum criteria for merging are met.

With non-merge processors, a FlowFile that results in a hung thread or a long-running thread execution would block processing of the FlowFiles behind it in the queue.

For merge processors, depending on data volumes and configuration, 5 minutes might be expected behavior (or you could set a Max Bin Age of 5 minutes to force a bin to merge even if the minimums have not been satisfied).

So I think there are two approaches to look at here. One monitors long-running threads and the other looks at failures.
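To make the bin behavior concrete, here is a minimal Python sketch of the merge decision described above. It is a simplified model for illustration only, not NiFi's actual implementation; the names `Bin`, `should_merge`, `min_entries`, and `max_bin_age_seconds` are hypothetical, and real merge processors consider more criteria than these two.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bin:
    entry_count: int    # FlowFiles currently allocated to the bin
    age_seconds: float  # time since the first FlowFile entered the bin


def should_merge(bin_: Bin, min_entries: int,
                 max_bin_age_seconds: Optional[float]) -> bool:
    """Return True when the bin is eligible to merge (simplified model)."""
    # Normal case: the minimum has been satisfied.
    if bin_.entry_count >= min_entries:
        return True
    # Fallback: a configured max bin age forces the merge even though
    # the minimum has not been met. With no max age, the bin keeps waiting.
    return (max_bin_age_seconds is not None
            and bin_.age_seconds >= max_bin_age_seconds)
```

In this model, a bin holding 3 of a required 10 FlowFiles waits indefinitely when no max age is set, but merges once it is 300 seconds old if the max age is 300.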

  • Runtime Monitoring Properties: When configured, this background process checks for long-running threads and produces log output and NiFi bulletins when a thread exceeds a threshold. You could build an alerting dataflow around this using the SiteToSiteBulletinReportingTask, some routing processors (to filter specific types of bulletins related to long-running tasks), and then an email processor.

  • The majority of processors that have potential for failures will have a failure relationship. You can build a dataflow using that failure relationship to alert on those failures. Consider a failure relationship routed to an UpdateAttribute processor that uses the Advanced UI to increment a failure counter, which then feeds a RouteOnAttribute processor that handles routing based on the number of failed attempts. After x failures it could send an email via PutEmail.
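The failure-counter idea in the second bullet can be sketched in plain Python. This only models the logic; in NiFi it would be configured in the processors themselves rather than written as code. The attribute name `failure.count`, the threshold `MAX_FAILURES`, and the route names `retry` and `alert` are all hypothetical choices for illustration.

```python
MAX_FAILURES = 3  # hypothetical value for "x" in the text above


def on_failure(attributes: dict) -> dict:
    """Models the UpdateAttribute step: increment the failure counter.

    FlowFile attributes are strings, so the counter is stored as text.
    """
    updated = dict(attributes)
    current = int(attributes.get("failure.count", "0"))
    updated["failure.count"] = str(current + 1)
    return updated


def route(attributes: dict, max_failures: int = MAX_FAILURES) -> str:
    """Models the RouteOnAttribute step: retry until the limit, then alert."""
    count = int(attributes.get("failure.count", "0"))
    return "alert" if count >= max_failures else "retry"
```

A FlowFile that fails once or twice would loop back through `retry`; on the third failure it would go to `alert`, which is where the email processor would sit.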

Apache NiFi does not have a background "Queued Duration" monitoring capability. Building one programmatically would be expensive in resources, as you would need to monitor every constantly changing connection and parse out any FlowFile with a "Queued Duration" in excess of X amount of time. Consider a processor that is hung: the connection would continue to grow until backpressure kicks in and causes FlowFiles to start queueing upstream as well. You could end up with 10,000 FlowFiles alerting on queued duration.

Hopefully this helps you look at the use case a little differently. Keep in mind that all monitoring, including the examples I provided, will have an impact on performance.
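If someone did attempt a queued-duration check anyway, one way to avoid thousands of alerts is to summarize per connection instead of per FlowFile. The sketch below is only an illustration under assumptions: it takes an already-fetched list of FlowFile summaries as plain dictionaries, and the `queuedDuration` field (milliseconds) is assumed to match what a queue listing returns in your NiFi version. The function name and output keys are hypothetical, and the cost of fetching listings for every connection is exactly the expense described above.

```python
from typing import Optional


def summarize_stuck(connection_id: str, flowfile_summaries: list,
                    threshold_ms: int) -> Optional[dict]:
    """Collapse all over-threshold FlowFiles in one connection into one alert."""
    stuck = [s for s in flowfile_summaries
             if s.get("queuedDuration", 0) > threshold_ms]
    if not stuck:
        return None  # nothing over the threshold, so no alert
    return {
        "connectionId": connection_id,
        "stuckCount": len(stuck),
        "maxQueuedMs": max(s["queuedDuration"] for s in stuck),
    }
```

With a 5-minute threshold (300000 ms), a connection holding 10,000 old FlowFiles would produce a single summary with `stuckCount` of 10000 rather than 10,000 separate emails.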

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt