File Watcher scenario in HDF

Rising Star

Can we have a file-watcher mechanism in NiFi, where the data flow is triggered whenever a file shows up at the source? Is that the same as scheduling a GetFile processor, or setting it to run always?

1 ACCEPTED SOLUTION

Master Guru

A GetFile processor with the default scheduling (0 secs = run as fast as possible) should handle this. A common scenario that comes up is the idea of picking up only new files that have been placed in a directory, but never removing any of them. This will eventually be accomplished through ListFile and FetchFile processors (open pull-request for ListFile right now). ListFile will maintain state of what was seen on previous executions, and provide FetchFile with the new files to retrieve.
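
To illustrate the list-then-fetch idea outside of NiFi, here is a minimal Python sketch. It is not the ListFile/FetchFile implementation: the names `SeenFileLister` and `fetch_file` are hypothetical, and keeping state as an in-memory set of paths is an assumption made for brevity.

```python
from pathlib import Path


class SeenFileLister:
    """Hypothetical listing step: remembers which files earlier runs reported."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.seen = set()  # assumption: state is just the set of paths already listed

    def list_new_files(self):
        """Return files that were not reported by any previous call."""
        new_files = []
        for entry in sorted(self.directory.iterdir()):
            if entry.is_file() and entry not in self.seen:
                self.seen.add(entry)
                new_files.append(entry)
        return new_files


def fetch_file(path):
    """Hypothetical fetch step: read the content and leave the source file in place."""
    return Path(path).read_bytes()
```

Calling `list_new_files()` on a schedule and passing each result to `fetch_file()` picks up every file once without deleting anything from the directory. A real flow would also need to persist the state across restarts, which this sketch does not attempt.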


4 REPLIES


See https://issues.apache.org/jira/browse/NIFI-631 for reference on ListFile and FetchFile.


You should definitely talk to @nmaillard; he is developing a File-Notification processor that can do that. I think it is triggered when new files show up in HDFS (not sure about changes), and it gives you access to various file attributes.

Rising Star

Thanks @bbende and @Jonas Straub