Scheduling NiFi Processor to run upon receiving the first flow file of the day

Explorer

How can I schedule a NiFi processor to run only when it receives the first flow file of the day?

The processor can ignore all subsequent flowfiles.

6 REPLIES

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

Explorer

Any suggestions?

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

@mbaid 

"How can I schedule a NiFi processor to run only when it receives the first flow file of the day?"

If you know the time of the first flow file, you can configure a processor with a Scheduling Strategy of CRON driven.

If you don't know the time of the first flow file, the solution depends on which processor comes first in the flow.

 

"The processor can ignore all subsequent flowfiles."

Does this mean the subsequent flow files can be removed?

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

Explorer

@Wynner 

 

I do not know the time of the first flowfile. Sorry, but I couldn't understand why the solution would differ depending on the processor.

Can you please suggest an approach for something like a PutHive processor?

 

"The processor can ignore all subsequent flowfiles."

"Does this mean the subsequent flow files can be removed?"

 

Yes, the subsequent flowfiles for the day should be removed/ignored for that processor.

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

I can’t help but point out that NiFi is an always-on tool. My recommendation is to always write the data into Hive with a timestamp.

 

Then, in a separate flow branch that runs once per day on a CRON schedule, select the single result you want (the first/oldest) for a given day.
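
As a rough illustration of the "select the first/oldest row per day" step, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Hive, and the table name `events` and the columns `ts` and `payload` are hypothetical; the actual Hive query would need to be adapted to the real schema.

```python
import sqlite3


def first_row_per_day(conn):
    """Return (day, ts, payload) for the earliest row of each calendar day."""
    query = """
        SELECT substr(e.ts, 1, 10) AS day, e.ts, e.payload
        FROM events AS e
        JOIN (
            -- earliest timestamp seen on each day
            SELECT MIN(ts) AS first_ts
            FROM events
            GROUP BY substr(ts, 1, 10)
        ) AS f ON e.ts = f.first_ts
        ORDER BY e.ts
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data: every flow file was written with a timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2020-03-01 09:15:00", "b"),
        ("2020-03-01 07:30:00", "a"),
        ("2020-03-02 08:00:00", "c"),
    ],
)
rows = first_row_per_day(conn)
```

With this sample data, `rows` holds one entry per day: the 07:30 row for the first day and the 08:00 row for the second.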

 

Another option is some kind of counter that resets daily. The first hit, which increments the count to 1, goes on to Hive. After that, whenever the counter is greater than 1, route everything away from Hive.
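
The counter idea can be sketched in plain Python to show the gating logic. This is not NiFi code: the class name `DailyCounterGate` and the route names `to_hive` and `skip` are made up for illustration, and a real flow would keep the count in some shared store rather than in memory.

```python
from datetime import date


class DailyCounterGate:
    """Counts flow files per day and lets only the first one through."""

    def __init__(self):
        self._day = None
        self._count = 0

    def route(self, today):
        # Reset the counter when the calendar day changes.
        if today != self._day:
            self._day = today
            self._count = 0
        self._count += 1
        # Only the hit that moves the count to 1 goes to Hive.
        return "to_hive" if self._count == 1 else "skip"


gate = DailyCounterGate()
routes = [
    gate.route(date(2020, 3, 1)),
    gate.route(date(2020, 3, 1)),
    gate.route(date(2020, 3, 2)),
]
```

Here `routes` ends up as `["to_hive", "skip", "to_hive"]`: the second flow file on 1 March is routed away, and the first one on 2 March passes again.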

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

Explorer

@stevenmatison 

What about cases where one is not ingesting data? For example, someone may be trying to add or delete a daily partition of an external table in Hive, and running that statement for each flow file would be a waste of resources.

 

Which processors would let me implement such a counter, and how would I reset it every day?

Re: Scheduling NiFi Processor to run upon receiving the first flow file of the day

Maybe you can define your use case better. I was simply going off the limited description.

 

 

Check out the DistributedMapCache Controller Service and its associated processors. You can use it to store a count that is checked in your flow to implement the gating you want to achieve.
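
One way to avoid an explicit daily reset is to key the cache entry by the current date and only let a flow file through when that key is new. The sketch below shows that put-if-absent logic in plain Python. `MapCacheStandIn`, `is_first_of_day`, and the `my_table` key prefix are hypothetical names, not NiFi APIs. In NiFi, a processor backed by the cache service (DetectDuplicate with a date-based cache entry identifier may be one candidate, though that is an assumption to check against the processor documentation) would play this role.

```python
from datetime import date


class MapCacheStandIn:
    """In-memory stand-in for a shared key/value cache (not a NiFi API)."""

    def __init__(self):
        self._entries = {}

    def put_if_absent(self, key, value):
        """Store value only if key is new; return True when it was stored."""
        if key in self._entries:
            return False
        self._entries[key] = value
        return True


def is_first_of_day(cache, today, table="my_table"):
    # A new date produces a new key, so nothing has to be reset at midnight.
    key = f"{table}:{today.isoformat()}"
    return cache.put_if_absent(key, "seen")


cache = MapCacheStandIn()
results = [
    is_first_of_day(cache, date(2020, 3, 1)),
    is_first_of_day(cache, date(2020, 3, 1)),
    is_first_of_day(cache, date(2020, 3, 2)),
]
```

Only the first call for each date returns `True`, so only that flow file would trigger the Hive statement. Old keys accumulate in this sketch; a real cache would need some form of expiry.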