
How to schedule a process to fetch only new files from a directory in Apache NiFi?

Explorer

Hi,

I want to fetch only the new files added to a directory in Apache NiFi, exactly once: after a file has been picked up, it should not be picked up again. I want to schedule this process to run every 3 hours. Please provide a solution with screenshots of the properties you used, or tell me which processors you are using. I am a bit confused between ListFile, GetFile, and FetchFile, and which properties to use.

Any help with this issue will be greatly appreciated.

Thank You!

8 REPLIES


Take a look at the NiFi ListFile and FetchFile processors. They work together. ListFile reads file metadata and keeps state based on the last-modified date of the files it has already listed, so only newly added files are listed. FetchFile takes the filename attribute set by ListFile and fetches the file's contents.
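To make the state-keeping idea concrete, here is a minimal Python sketch of timestamp-based listing. It is not NiFi's actual implementation, only a simplified illustration: the function name `list_new_files` and the `state` dictionary are made up for this example, and the real processor handles more edge cases (for instance, files that share the same timestamp).

```python
import os
import tempfile


def list_new_files(directory, state):
    """Return names of files modified after the stored timestamp, then advance it.

    `state` is a plain dict standing in for the processor's stored state
    (a hypothetical simplification for this sketch).
    """
    last_seen = state.get("last_modified", 0.0)
    newest = last_seen
    new_files = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        mtime = entry.stat().st_mtime
        if mtime > last_seen:
            new_files.append(entry.name)
            newest = max(newest, mtime)
    # Remember the newest timestamp so the next run skips these files.
    state["last_modified"] = newest
    return sorted(new_files)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        state = {}
        first = os.path.join(directory, "a.txt")
        open(first, "w").close()
        os.utime(first, (1000, 1000))  # fixed modification time for the demo
        print(list_new_files(directory, state))  # ['a.txt']
        print(list_new_files(directory, state))  # []
        second = os.path.join(directory, "b.txt")
        open(second, "w").close()
        os.utime(second, (2000, 2000))
        print(list_new_files(directory, state))  # ['b.txt']
```

Running the listing twice without adding a file returns nothing the second time, which is the "pick each file only once" behavior described above.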

Hope that helps

Explorer

Hi samsal,

Thanks for the reply. Can you please share screenshots? I'm a bit confused about which properties to use in ListFile and FetchFile.


You really don't need a screenshot because you are not changing many properties:

1- Create a ListFile processor and set the "Input Directory" to whatever directory you want to track.

2- Create a FetchFile processor and connect ListFile to it via the "success" relationship. In the processor properties, keep the "File to Fetch" property set to "${absolute.path}/${filename}", since ListFile sets the path and the file name in those attributes. That is it.
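As a rough illustration of how the "File to Fetch" value turns into a concrete path, the Python sketch below substitutes attribute values into the expression. The `resolve` helper and the sample attribute values are hypothetical; NiFi's real Expression Language supports much more than plain attribute references, which is all this sketch handles.

```python
import re


def resolve(expression, attributes):
    """Replace each ${name} placeholder with the matching attribute value."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda match: attributes[match.group(1)],
        expression,
    )


# Example attribute values (made up) of the kind ListFile sets on each FlowFile.
attributes = {"absolute.path": "/data/incoming", "filename": "report.csv"}

file_to_fetch = resolve("${absolute.path}/${filename}", attributes)
print(file_to_fetch)  # /data/incoming/report.csv
```

Because each listed file carries its own attributes, the same property value resolves to a different path for every FlowFile that reaches FetchFile.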

 

After that, the content of the file is passed along via the success relationship, and you can do whatever you want with it, just as if you were using GetFile. The difference is that ListFile keeps state of the latest file timestamp it has listed and uses it to pick up any new files added to the folder, then updates the state to the new timestamp, and so on.

Explorer

Hi samsal,

Thanks for your help. I have used ListFile followed by FetchFile, and there is only one file in my directory. I've set Listing Strategy in ListFile to 'Tracking Timestamps', and when I executed the job it brought the file only once. I am confused: will it bring the same file only once, or every time I execute the job?


Explorer

Got it. Thank you


Explorer

Hi,

Matt, thanks for the explanation.