File Watcher scenario in HDF

Can we have a file-watcher mechanism in NiFi, where the data flow is triggered whenever a file shows up at the source? Is that the same as scheduling a GetFile processor, or setting it to run always?

1 ACCEPTED SOLUTION

A GetFile processor with the default scheduling (0 sec, meaning run as fast as possible) should handle this. A common related scenario is picking up only the new files that have been placed in a directory, without ever removing any of them. This will eventually be handled by the ListFile and FetchFile processors (there is an open pull request for ListFile right now). ListFile will maintain state about what it saw on previous executions and pass FetchFile the new files to retrieve.
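To make the list-then-fetch idea concrete, here is a rough sketch in plain Python. It is not NiFi code and not how ListFile is actually implemented; the `NewFileLister` class and `fetch` function are hypothetical names made up for illustration, and the state is kept only in memory. It just shows the pattern: remember what was reported on earlier runs, hand over only the new files, and read them without deleting anything.

```python
import os


class NewFileLister:
    """Hypothetical stand-in for the 'list' step: remembers what it already reported."""

    def __init__(self, directory):
        self.directory = directory
        # Assumption: state is an in-memory map of path -> modification time.
        self.seen = {}

    def list_new(self):
        """Return paths that are new or modified since the previous call."""
        with os.scandir(self.directory) as entries:
            files = sorted((e for e in entries if e.is_file()), key=lambda e: e.name)
        new_paths = []
        for entry in files:
            mtime = entry.stat().st_mtime_ns
            if self.seen.get(entry.path) != mtime:
                self.seen[entry.path] = mtime
                new_paths.append(entry.path)
        return new_paths


def fetch(path):
    """Hypothetical stand-in for the 'fetch' step: read the file, leave it in place."""
    with open(path, "rb") as handle:
        return handle.read()
```

Calling `list_new()` repeatedly on a schedule and passing each returned path to `fetch` gives the "only new files, never remove them" behaviour described above. State that survives restarts would need to be persisted somewhere, which this sketch does not attempt.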

For reference, ListFile and FetchFile are tracked in https://issues.apache.org/jira/browse/NIFI-631

You should definitely talk to @nmaillard; he is developing a File-Notification processor that is capable of doing that. I think it gets triggered when new files show up in HDFS (I'm not sure about changes to existing files), and it gives you access to different file attributes.

Thanks @bbende and @Jonas Straub
