File Watcher scenario in HDF

Rising Star

Can we have a file-watcher mechanism in NiFi, where the data flow is triggered whenever a file shows up at the source? Is that the same as scheduling a GetFile processor to run always?

1 ACCEPTED SOLUTION

Master Guru

A GetFile processor with the default scheduling (0 secs = run as fast as possible) should handle this: it polls the directory continuously and picks up files as they arrive. A common related scenario is picking up only the new files placed in a directory without ever removing any of them. This will eventually be accomplished through the ListFile and FetchFile processors (there is an open pull request for ListFile right now). ListFile will maintain state about what it saw on previous executions and provide FetchFile with the new files to retrieve.
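To make the "maintain state of what was seen" idea concrete, here is a minimal Python sketch of that listing pattern. It is not NiFi code and not how ListFile is actually implemented; the function name `list_new_files` and the `seen` dictionary are hypothetical, chosen only for illustration. It assumes that a file counts as new when its path has not been reported before or its modification time has changed.

```python
import os


def list_new_files(directory, seen):
    """Return paths in `directory` that were not reported on earlier calls.

    `seen` is a caller-owned dict mapping path -> last observed modification
    time. It plays the role of the state kept between executions. Files are
    only listed here, never moved or deleted.
    """
    new_files = []
    with os.scandir(directory) as entries:
        # Sort by name so the result order is deterministic.
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_file():
                continue
            mtime = entry.stat().st_mtime
            # Report the file if it is unknown or has changed since last time.
            if seen.get(entry.path) != mtime:
                seen[entry.path] = mtime
                new_files.append(entry.path)
    return new_files
```

Calling `list_new_files` repeatedly with the same `seen` dictionary returns each file once; a separate step (the FetchFile role) would then read the returned paths.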

4 REPLIES

See https://issues.apache.org/jira/browse/NIFI-631 for reference on ListFile and FetchFile.

You should definitely talk to @nmaillard; he is developing a File-Notification processor that can do that. I think it gets triggered when new files show up in HDFS (not sure about changes), and it gives you access to various file attributes.

Rising Star

Thanks @bbende and @Jonas Straub