Min & Max age filter in ListFile Processor.

Hi Team,

I have a scenario where I need to read a file with an older timestamp after a file with a newer timestamp has already been processed by the ListFile processor. Below are the details of what I tried.

I set the Min and Max age filters in the ListFile processor as follows:

Min Age: 300 sec (5 minutes)
Max Age: 864000 sec (10 days)

Then I touch a file in the file system with the latest timestamp, as shown below:

-rw-r--r-- 1 userA userB users 0 Jul 9 00:57 a.txt

The file gets picked up by the ListFile processor. Then I touch a file in the file system with an older timestamp, as shown below:

-rw-r--r-- 1 userA userB users 0 Jul 5 00:00 b.txt

However, this file is not picked up by the processor. My understanding was that any file whose modified time is between 5 minutes and 10 days old should be picked up.

Could you please explain the actual behaviour of the Min/Max age filter? Also, could you please let me know whether the scenario I have described above can be achieved?

Thanks & Regards,

R.Rohit

Accepted Solutions

Re: Min & Max age filter in ListFile Processor.

Hello @Rohit Ravishankar

ListFile uses the last modified timestamp of files. It tracks the latest last modified timestamp it has seen so that, on each run, it picks up only files modified since the previous run. So, if a file is added with an older last modified timestamp than one ListFile has already picked up, ListFile's logic will not pick it up. There is an existing JIRA discussing this behavior [1].

The Min/Max age filter is used to filter out files that are too young (min) or too old (max). Even if a file passes these conditions, it won't be picked up if its last modified timestamp is older than the latest one already listed.

If your use case requires processing input files in descending last modified timestamp order, then I'd recommend using a combination of GetFile (keepSourceFile = false) and PriorityAttributePrioritizer [2].
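To make the interaction between the two checks concrete, here is a rough Python sketch of the behavior described above. It is a simplified model and not NiFi's actual implementation: the class `ListingModel`, its field names, and the year 2018 in the timestamps are all made up for illustration (the original listing shows no year), and the real processor handles details such as equal timestamps differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class ListingModel:
    """Hypothetical, simplified model of a timestamp-tracking lister."""
    min_age: timedelta
    max_age: timedelta
    latest_listed: Optional[datetime] = None  # newest last-modified time listed so far

    def list_files(self, files: Dict[str, datetime], now: datetime) -> List[str]:
        picked = []
        # Walk the directory contents from oldest to newest last-modified time.
        for name, modified in sorted(files.items(), key=lambda kv: kv[1]):
            age = now - modified
            # Check 1: the Min/Max age filter.
            if not (self.min_age <= age <= self.max_age):
                continue
            # Check 2: skip anything not newer than what was already listed.
            if self.latest_listed is not None and modified <= self.latest_listed:
                continue
            picked.append(name)
        if picked:
            self.latest_listed = max(files[name] for name in picked)
        return picked


model = ListingModel(min_age=timedelta(seconds=300), max_age=timedelta(seconds=864000))

# Year 2018 is an assumption; only "Jul 9 00:57" and "Jul 5 00:00" come from the question.
files = {"a.txt": datetime(2018, 7, 9, 0, 57)}
first_run = model.list_files(files, now=datetime(2018, 7, 9, 1, 10))
print(first_run)   # ['a.txt']

files["b.txt"] = datetime(2018, 7, 5, 0, 0)
second_run = model.list_files(files, now=datetime(2018, 7, 9, 1, 20))
print(second_run)  # []
```

In this model b.txt is about four days old, so it passes the age filter, but it is still skipped on the second run because its timestamp is older than the one already recorded for a.txt.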
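As a rough illustration of the ordering idea, the sketch below computes a numeric rank per file so that the newest file gets the smallest value. It assumes, as the user guide [2] describes, that PriorityAttributePrioritizer processes the FlowFile with the lowest "priority" attribute value first. The function `assign_priorities`, the sample timestamps, and the year 2018 are hypothetical, and how the attribute actually gets set in the flow is left open here.

```python
from datetime import datetime
from typing import Dict


def assign_priorities(files: Dict[str, datetime]) -> Dict[str, int]:
    """Give the newest file rank 0, the next newest rank 1, and so on."""
    newest_first = sorted(files, key=lambda name: files[name], reverse=True)
    return {name: rank for rank, name in enumerate(newest_first)}


# Year 2018 is an assumption; the original listing shows no year.
sample_files = {
    "b.txt": datetime(2018, 7, 5, 0, 0),
    "a.txt": datetime(2018, 7, 9, 0, 57),
}

priorities = assign_priorities(sample_files)
# A lowest-value-first queue would then hand the files out in this order.
processing_order = sorted(priorities, key=priorities.get)
print(priorities)        # {'a.txt': 0, 'b.txt': 1}
print(processing_order)  # ['a.txt', 'b.txt']
```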

[1] https://issues.apache.org/jira/browse/NIFI-2383

[2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#prioritization

I hope this helps.
