Custom processor in NiFi using the event-driven scheduling strategy?

Explorer

I need to build a custom NiFi processor that uses the event-driven scheduling strategy. Can anyone suggest an approach or point me to some links?

Is event-driven only supported in 1.2?

1 ACCEPTED SOLUTION

I believe event-driven is still somewhat experimental. For processors that have an incoming queue, you generally schedule them as timer-driven with a run schedule of 0, which means "run all the time", and the framework then only runs them when the queue has data in it. So if your queue is always empty, the processor won't be using any CPU cycles.
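To make the scheduling behaviour described above a bit more concrete, here is a minimal sketch in plain Java. It is only a toy model: `TimerDrivenSketch`, `RecordingProcessor`, and `runTicks` are hypothetical names made up for illustration and are not part of the NiFi API. It assumes the scheduler checks the queue on every tick and only invokes the processor when something is queued.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model only: these classes are hypothetical and are not part of the NiFi API.
final class TimerDrivenSketch {

    /** Stand-in for a processor; records each flow file it is triggered with. */
    static final class RecordingProcessor {
        final List<String> seen = new ArrayList<>();

        void onTrigger(String flowFile) {
            seen.add(flowFile);
        }
    }

    /**
     * Simulates a fixed number of scheduler ticks with a run schedule of 0:
     * the scheduler looks at the queue on every tick, but only calls
     * onTrigger when the queue actually holds a flow file.
     * Returns the number of ticks on which the processor was triggered.
     */
    static int runTicks(Queue<String> incoming, RecordingProcessor processor, int ticks) {
        int triggered = 0;
        for (int i = 0; i < ticks; i++) {
            String flowFile = incoming.poll(); // null when the queue is empty
            if (flowFile == null) {
                continue; // nothing queued: the processor is not invoked
            }
            processor.onTrigger(flowFile);
            triggered++;
        }
        return triggered;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();
        queue.add("flowfile-1");
        queue.add("flowfile-2");
        RecordingProcessor processor = new RecordingProcessor();
        int triggered = runTicks(queue, processor, 10);
        System.out.println("ticks=10 triggered=" + triggered + " seen=" + processor.seen);
    }
}
```

With two queued flow files and ten ticks, the processor in this model is triggered on two ticks only; the remaining ticks find an empty queue and skip it.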

Is there something about your processor that would cause the above approach to not work?

4 REPLIES

Explorer

The onTrigger method is called every time a flow file is read, but I want onTrigger to be called only on completion of a particular event or condition, so I want to use event-driven scheduling for that.

The difference between event-driven and timer-driven is the following...

In timer-driven the framework is always checking if there is work to do (i.e. data available in a queue) and then triggering the processor when there is.

In event-driven the framework would take a flow file and pass it directly from the previous processor to the next processor.

So in both cases onTrigger is only being called when a flow file is available, but event-driven would be more efficient for the framework.
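As a rough picture of the event-driven side of that comparison, here is a small plain-Java sketch. Again this is only a toy model under my own assumptions: `EventDrivenSketch`, `PushConnection`, and `pushAll` are hypothetical names, not NiFi classes. The arrival of a flow file on the connection is what invokes the downstream processor, with no polling step in between.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: hypothetical names, not the NiFi API.
final class EventDrivenSketch {

    /** A connection that hands each flow file straight to the downstream processor. */
    static final class PushConnection {
        private final Consumer<String> downstreamOnTrigger;

        PushConnection(Consumer<String> downstreamOnTrigger) {
            this.downstreamOnTrigger = downstreamOnTrigger;
        }

        /** Called by the upstream processor; the arrival of data is the trigger. */
        void transfer(String flowFile) {
            downstreamOnTrigger.accept(flowFile);
        }
    }

    /** Pushes the given flow files through a connection and returns what downstream saw. */
    static List<String> pushAll(List<String> flowFiles) {
        List<String> seen = new ArrayList<>();
        PushConnection connection = new PushConnection(seen::add);
        for (String flowFile : flowFiles) {
            connection.transfer(flowFile);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(pushAll(Arrays.asList("flowfile-1", "flowfile-2")));
    }
}
```

Either way, the downstream code runs once per available flow file; the two models differ only in who notices that the flow file is there.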

New Contributor

Event-driven is also very useful (or rather, the only solution) in some cases.

For example, in our case we have a sequence of HTTP calls that result in multiple flow files. Each flow file is collected and also forks into a separate flow to create a new flow file, so this kind of works like a linked list. What we gotta do is merge all of these individual flow files into a single archive at the end using the MergeContent processor. This must only happen once we've collected all the flow files, and we don't really know how many flow files we will get until the last one is collected.

So the MergeContent processor should only be triggered once all the flow files are collected, not before that. The timer-driven and CRON-driven strategies have no provision for this.

We tried to solve this issue with Wait and Notify, but apparently there is no way to block the MergeContent processor from being triggered until the last flow file has been queued to it. Transferring and queuing a large number of flow files (say 200,000) to the MergeContent processor itself takes some time, so it is possible that the MergeContent processor gets triggered halfway through the queuing.
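To spell out the behaviour we are after, here is a minimal plain-Java sketch of a "gate" that buffers flow files and releases them as one batch only when the total is known and every flow file has arrived. This is just an illustration of the requirement: `MergeGateSketch`, `offer`, and `announceTotal` are hypothetical names, and it assumes the upstream flow can report the total once the last flow file is collected. It is not NiFi code and not how MergeContent is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: hypothetical names, not the NiFi API or MergeContent itself.
final class MergeGateSketch {
    private final List<String> buffered = new ArrayList<>();
    private int expectedTotal = -1; // unknown until the last flow file is collected

    /** Buffers one flow file; returns the full batch once it is complete, else an empty list. */
    List<String> offer(String flowFile) {
        buffered.add(flowFile);
        return releaseIfComplete();
    }

    /** Called once the upstream flow knows how many flow files were produced in total. */
    List<String> announceTotal(int total) {
        expectedTotal = total;
        return releaseIfComplete();
    }

    private List<String> releaseIfComplete() {
        if (expectedTotal < 0 || buffered.size() < expectedTotal) {
            return Collections.emptyList(); // still waiting: do not merge yet
        }
        List<String> batch = new ArrayList<>(buffered);
        buffered.clear();
        expectedTotal = -1; // reset for the next batch
        return batch;
    }

    public static void main(String[] args) {
        MergeGateSketch gate = new MergeGateSketch();
        System.out.println(gate.offer("part-1"));  // total still unknown
        System.out.println(gate.announceTotal(3)); // total known, only 1 of 3 queued
        System.out.println(gate.offer("part-2"));
        System.out.println(gate.offer("part-3"));  // batch is now complete
    }
}
```

The point of the sketch is that the merge is released by a condition (total known and all parts queued) rather than by a timer, so it does not matter how long the queuing of the parts takes.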