Does NiFi evaluate processor properties (with Expression Language) on each scheduled run or only the first time?

Expert Contributor

Hello,

When a NiFi processor property includes Expression Language and the processor is scheduled to run at set intervals, is the expression evaluated on each scheduled run, or only once on the first run?

The reason I'm asking: I have a GetHDFS processor that is scheduled to run once daily, and its 'Directory' property includes Expression Language. Since I want the processor to point to the previous day's directory, I have set the 'Directory' property as follows:

/user/nifitest/${now():toNumber():minus(86400000):format('yyyy')}/${now():toNumber():minus(86400000):format('MM')}/${now():toNumber():minus(86400000):format('yyyy_MM_dd')}

The above expression evaluates correctly to the directory that was created the previous day; for example, today's run (7-12-2017) would point to /user/nifitest/2017/07/2017_07_11.
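For readers who want to check the arithmetic, here is a minimal Python sketch of what the expression above computes. It is not NiFi code; the function name `previous_day_directory` is hypothetical, and it assumes that subtracting 86400000 milliseconds (24 hours) from the current time and formatting the result behaves like the `minus` and `format` calls in the expression.

```python
from datetime import datetime, timedelta

# 86400000 ms is the offset used in the 'Directory' expression (24 hours).
ONE_DAY = timedelta(milliseconds=86_400_000)


def previous_day_directory(now: datetime) -> str:
    """Build the path the expression is expected to produce for a given 'now'.

    Hypothetical helper for illustration only; NiFi evaluates the real
    expression itself.
    """
    yesterday = now - ONE_DAY
    # yyyy / MM / yyyy_MM_dd in the expression map to %Y / %m / %Y_%m_%d here.
    return yesterday.strftime("/user/nifitest/%Y/%m/%Y_%m_%d")


if __name__ == "__main__":
    print(previous_day_directory(datetime(2017, 7, 12, 8, 0)))
```

For a run on 2017-07-12 this yields /user/nifitest/2017/07/2017_07_11, matching the example in the post.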

After it is scheduled, the GetHDFS processor starts at the scheduled time and the first run works perfectly: it processes all the files in the previous day's directory. On subsequent scheduled runs, however, it does not find any files. I was not able to find in the NiFi log the exact directory path the processor points to, but this is what the log shows:

2017-06-30 08:18:00,000 ERROR [NiFi logging handler] org.apache.nifi.StdErr [Timer-Driven Process Thread-10] INFO org.apache.nifi.processors.hadoop.GetHDFS - GetHDFS[id=b0d21ab8-1001-1159-15dd-4d380d420cab] Kerberos ticket age exceeds threshold [14400 seconds] attempting to renew ticket for user nifitest/dcdrlhadoop1a.mdanderson.edu@MDANDERSON.EDU
2017-06-30 08:18:00,057 ERROR [NiFi logging handler] org.apache.nifi.StdErr [Timer-Driven Process Thread-10] INFO org.apache.nifi.processors.hadoop.GetHDFS - GetHDFS[id=b0d21ab8-1001-1159-15dd-4d380d420cab] Kerberos relogin successful or ticket still valid
2017-06-30 08:18:00,154 ERROR [NiFi logging handler] org.apache.nifi.StdErr [Timer-Driven Process Thread-6] INFO org.apache.nifi.processors.standard.GetHTTP - GetHTTP[id=19a2140b-1178-102e-de2f-9e978bc6b90a] content not retrieved because server returned HTTP Status Code 304: Not Modified
2017-06-30 08:18:00,182 ERROR [NiFi logging handler] org.apache.nifi.StdErr [Timer-Driven Process Thread-10] INFO org.apache.nifi.processors.hadoop.GetHDFS - GetHDFS[id=b0d21ab8-1001-1159-15dd-4d380d420cab] Obtained file listing in 181 milliseconds; listing had 0 items, 0 of which were new

The first run after the processor is scheduled works perfectly (it processes all the files in the previous day's directory), but the subsequent runs do not. That makes me suspect that the 'Directory' property is evaluated only once and that the same value is reused for each subsequent scheduled run, so the processor points to the same directory every time. The log message "Obtained file listing in 181 milliseconds; listing had 0 items, 0 of which were new" is what makes me think it is still pointing to the first run's directory.
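To make the suspected behaviour concrete, here is a small Python sketch contrasting a property that is resolved once when a processor starts with one that is resolved on every run. This is only an illustration of the hypothesis, not NiFi's actual implementation; the class names `CachedAtStart` and `EvaluatedPerRun` and the helper `resolve_directory` are all hypothetical.

```python
from datetime import datetime, timedelta


def resolve_directory(now: datetime) -> str:
    """Stand-in for evaluating the 'Directory' expression at time 'now'."""
    return (now - timedelta(days=1)).strftime("/user/nifitest/%Y/%m/%Y_%m_%d")


class CachedAtStart:
    """Suspected behaviour: the expression is evaluated once, at start."""

    def __init__(self, started_at: datetime) -> None:
        self.directory = resolve_directory(started_at)

    def on_trigger(self, now: datetime) -> str:
        # 'now' is ignored: every run reuses the value computed at start.
        return self.directory


class EvaluatedPerRun:
    """Expected behaviour: the expression is evaluated on every run."""

    def on_trigger(self, now: datetime) -> str:
        return resolve_directory(now)


if __name__ == "__main__":
    day1 = datetime(2017, 7, 12, 8, 0)
    day2 = day1 + timedelta(days=1)
    cached = CachedAtStart(day1)
    print(cached.on_trigger(day2))             # still the directory from day 1
    print(EvaluatedPerRun().on_trigger(day2))  # moves on to the next directory
```

In the cached variant the second day's run lists a directory whose files were already processed, which would be consistent with a listing of 0 new items. Creating a fresh `CachedAtStart` corresponds to stopping and restarting the processor.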

I was expecting the processor to evaluate the 'Directory' property on each scheduled run. Does it do that? If not, how do I make this work? Since Get* processors do not accept inbound connections, I cannot compute the 'Directory' value first in an UpdateAttribute processor and pass it to GetHDFS.

Thanks in advance.

1 ACCEPTED SOLUTION

Expert Contributor

This is a known issue with GetHDFS - https://issues.apache.org/jira/browse/NIFI-2956 - which is resolved in NiFi 1.1.0.

2 REPLIES

Expert Contributor

Once I stop and start the GetHDFS processor, the 'Directory' expression appears to be re-evaluated: it then correctly points to the previous day's directory and processes the files there. This further suggests that the expression is evaluated only on the first scheduled run and not on subsequent runs. Is there a workaround to force the expression to be evaluated on each run?
