Created 08-22-2016 08:22 AM
Hello y'all :) When I execute a flow, how can I tell whether it is running and what exactly it is doing? I couldn't find any indication or log, so how can I know how the flow is progressing and which step it is on at a particular moment? In the UI I noticed that a yellow tooltip with information appears only when a step fails, but I'd rather have an indication of what is currently happening. Is there a way to get notified (even in the UI) that a flow has finished?
Thanks
Adi
Created 08-22-2016 09:39 AM
There are a number of ways you can monitor data running through the flow.
The first is to take a look at the processors in the UI, which will show you the amount of data that has gone through them in the last 5 minutes. You can also right-click a processor or a connection and select stats, which will graph these values for you and show you the 'pulse' of your data flow.
These stats are also rolled up on a summary stats screen (the little table icon on the top right toolbar, which lists all the processors).
Another method is to use the MonitorActivity processor with something like PutEmail to alert you if, for example, you haven't received any data in your flow for the last n seconds.
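To make the idea concrete, here is a minimal sketch of the inactivity-threshold logic in plain Python. It is not NiFi's implementation of MonitorActivity; the class name `InactivityMonitor`, its methods, and the returned strings are hypothetical and only illustrate "alert once after n quiet seconds, then report when data returns".

```python
import time
from typing import Callable, Optional


class InactivityMonitor:
    """Reports once when no data has been seen for `threshold_s` seconds.

    Hypothetical illustration only; not how NiFi implements MonitorActivity.
    """

    def __init__(self, threshold_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock
        self.last_seen = clock()
        self.alerted = False

    def record_activity(self) -> Optional[str]:
        """Call whenever data arrives; returns 'restored' after a quiet spell."""
        self.last_seen = self.clock()
        if self.alerted:
            self.alerted = False
            return "restored"
        return None

    def check(self) -> Optional[str]:
        """Call periodically; returns 'inactive' the first time the threshold is hit."""
        quiet_for = self.clock() - self.last_seen
        if not self.alerted and quiet_for >= self.threshold_s:
            self.alerted = True
            return "inactive"
        return None
```

In a real flow the "inactive" signal is what you would route to PutEmail; the sketch just shows why you get one alert per quiet period rather than one per check.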
If you're worried about data being queued up, the top of the UI has a handy indicator of how much data is currently queued. Process groups also conveniently show the totals for any queues within them. This can often indicate if there is a bottleneck in your flow somewhere, and how far the data has got through that pipeline.
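If you would rather poll the queue totals from a script than watch the UI, something along these lines may work. This is a sketch under assumptions: the `/nifi-api/flow/status` path and the `controllerStatus`, `flowFilesQueued`, `bytesQueued` and `activeThreadCount` field names are what I would expect from a NiFi 1.x REST API on an unsecured instance, so check the REST documentation for your version. The function names are my own.

```python
import json
from urllib.request import urlopen


def summarize_queue(status: dict) -> dict:
    """Pull queue counters out of a parsed status response (field names assumed)."""
    controller = status.get("controllerStatus", {})
    return {
        "flowfiles_queued": controller.get("flowFilesQueued", 0),
        "bytes_queued": controller.get("bytesQueued", 0),
        "active_threads": controller.get("activeThreadCount", 0),
    }


def is_drained(status: dict) -> bool:
    """True when nothing is queued and no threads are currently working."""
    summary = summarize_queue(status)
    return summary["flowfiles_queued"] == 0 and summary["active_threads"] == 0


def fetch_status(base_url: str) -> dict:
    """GET the status resource; the path is an assumption, verify it for your version."""
    with urlopen(f"{base_url}/nifi-api/flow/status", timeout=10) as resp:
        return json.load(resp)
```

Calling `is_drained(fetch_status("http://localhost:8080"))` in a loop (host and port are placeholders) would give a rough "has everything passed through yet" signal, with the caveat that an empty queue can also mean no data has arrived.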
Another option to check that data has fully passed through your flow is to check out the data provenance (either in the UI: right-click on a processor, data provenance) or via the API. This will show you everything that has happened to your data in microscopic detail.
In terms of just getting a simple notification when things are done, I often use the success relationship (for example after PutHDFS) to send out a notification with something like PutEmail (or InvokeHTTP to a web service) to notify that the process has completed successfully for a particular piece of data.
Created 08-23-2016 04:46 PM
@Simon Elliston Ball Thanks for the thorough answer. I tried using a PutEmail processor on failure relationships, but I didn't get any email even though the step completed with an error. To be more exact: I do receive email for successful steps without a problem. However, I have a flow in which the last step finishes with an error, and I get no email unless I use the retry relationship. If I connect the last step to the PutEmail processor using success, no email. If I use failure, no email. If I use retry, I do receive email.
Doesn't "error" mean the step failed? Any ideas?
Adi
Created 08-23-2016 05:02 PM
Right. If you get an error (yellow warning) on some processor, that implies the processor itself has failed in some way, which means the flow file will still be stuck in the queue before the failed processor step. The error will be shown in nifi-app.log and, of course, in the UI and the summary tables.
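If you want to pull those warnings and errors out of nifi-app.log with a script, a rough sketch follows. It assumes each log entry starts with `date time,millis LEVEL [thread] message`, which is what I would expect from a default logback pattern; if your logback.xml differs, the regular expression needs adjusting. The function name `find_problems` is my own.

```python
import re
from typing import Iterable, Iterator, Tuple

# Assumed layout: "<date> <time>,<millis> <LEVEL> [<thread>] <logger and message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) +\[(?P<thread>[^\]]*)\] (?P<msg>.*)$"
)


def find_problems(lines: Iterable[str],
                  levels: Tuple[str, ...] = ("WARN", "ERROR"),
                  ) -> Iterator[Tuple[str, str, str]]:
    """Yield (timestamp, level, message) for entries at the given levels.

    Lines that do not match the assumed layout (stack traces, for
    example) are skipped.
    """
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group("level") in levels:
            yield match.group("ts"), match.group("level"), match.group("msg")
```

You would feed it an open file handle for nifi-app.log (usually under the logs directory of the NiFi install, though that location is configurable) and print or forward whatever it yields.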
Created 08-24-2016 05:54 AM
So if a processor fails with a yellow warning, I am not supposed to get an email? Is it considered a "retry" and not a "failure"?