
setting cron driven jobs in nifi


Explorer

I am using multiple process groups connected through output (OUT) and input (IN) ports.

In the first process group, I fetch data from a table that contains date-time values.

Using the fetched date-time, how can I set the cron schedule for the first processor in the next process group?


Re: setting cron driven jobs in nifi

Explorer

Hi Joel, thanks for the reply.

Since I am new to NiFi, could you explain what you mean by creating a function?

All I am doing is fetching the time from the MySQL table and sending it to the OUT port.

Next, in the other process group, I read the time from the IN port, extract the date, hour, minute, and so on, and then try to use them as a cron expression in the form ${hour} ${minute}. But the '$' and '{' characters make it an invalid cron expression.

 

Another solution I came up with was to use RouteOnAttribute to compare the current timestamp, now(), with the fetched date-time; when they match, the FlowFile proceeds further.

This solution works, but it is not efficient, because RouteOnAttribute has to run continuously and keep checking the timestamp.

 

Please let me know if there is another solution.

Re: setting cron driven jobs in nifi

Master Guru

Hello @gari 

 

A NiFi processor only reads a FlowFile from an inbound connection when it executes, which means the processor has no access to a FlowFile's attributes until it executes. This makes it impossible for a processor to use a FlowFile's attributes to decide when the processor should execute.

 

You can only use NiFi Expression Language (EL) in component configuration properties that support EL. None of the configuration settings on the "Settings" or "Scheduling" tabs of a component support EL.

 

Also, be careful using a "match" type rule in your RouteOnAttribute dataflow. What happens if a GC event or threading issue slows down the reprocess loop and the exact "matching" time is missed? The FlowFile would be stuck in the loop forever. Consider using a "less than or equal to" (le) function instead, so that the FlowFile is released as soon as the target time has been reached or passed.
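To illustrate why the exact match is fragile, here is a minimal Python sketch of the loop. It is only a simulation, not NiFi code: the function name, the poll times, and the target time are all made up for the example. In RouteOnAttribute itself, the rule might look something like ${scheduled_time:toDate('yyyy-MM-dd HH:mm:ss'):toNumber():le(${now():toNumber()})}, where the attribute name scheduled_time and the date format are assumptions about your flow.

```python
from typing import Optional, Sequence


def first_release_time(target_ms: int, poll_times_ms: Sequence[int], rule: str) -> Optional[int]:
    """Return the first poll time at which the rule fires, or None if it never does.

    Hypothetical helper: each entry in poll_times_ms stands for one pass of the
    FlowFile through the RouteOnAttribute loop.
    """
    for now_ms in poll_times_ms:
        if rule == "equals" and now_ms == target_ms:
            return now_ms
        if rule == "le" and target_ms <= now_ms:
            return now_ms
    return None


# The loop normally checks every 500 ms, but the check that should have run
# at 1500 ms is delayed to 1700 ms (for example by a GC pause).
poll_times = [0, 500, 1000, 1700, 2200]
target = 1500

print(first_release_time(target, poll_times, "equals"))  # None: the exact moment was missed
print(first_release_time(target, poll_times, "le"))      # 1700: released on the first check after the target
```

With the "equals" rule the FlowFile never leaves the loop, because no check lands on the exact target time. With "le" it is released on the first check at or after the target.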

 

FlowFile attributes live in heap memory, so the executions performed by your RouteOnAttribute processor should amount to very little actual CPU usage. Are you observing a large "time" value in the processor's 5-minute stats? If so, adjust the run schedule on the RouteOnAttribute processor to control how often it checks inbound FlowFiles. "0 sec" means run as fast as the hardware will allow. Try changing it to "0.5 sec" to reduce some of the usage.
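As rough arithmetic for the trade-off, the sketch below counts how many checks fit into one 5-minute stats window for a given run schedule. It assumes the processor is triggered exactly once per run-schedule interval, which is a simplification of how NiFi actually schedules work, and the function name is made up for the example.

```python
def checks_per_window(run_schedule_ms: int, window_ms: int = 300_000) -> int:
    """Approximate number of processor runs in one stats window (default 5 minutes).

    Hypothetical helper; assumes one run per run-schedule interval.
    """
    if run_schedule_ms <= 0:
        # A "0 sec" schedule has no fixed interval: it runs as fast as it can.
        raise ValueError("run_schedule_ms must be positive")
    return window_ms // run_schedule_ms


print(checks_per_window(500))   # 600 checks per 5 minutes at "0.5 sec"
print(checks_per_window(1000))  # 300 checks per 5 minutes at "1 sec"
```

The flip side is that, under the same assumption, a FlowFile can be released up to one run-schedule interval late, so "0.5 sec" adds at most about half a second of delay.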

Hope this helps,
Matt
