How does NiFi handle large files still being written to a directory it is monitoring?


How does NiFi handle files that take seconds, tens of seconds, or minutes to be completely written to a directory (e.g. an ops system FTPs a file to a landing directory that NiFi is monitoring)? Does it "stream" the file as it is being written? Does it somehow know the file is done being written before GetFile picks it up?

Accepted Solutions

Re: How does NiFi handle large files still being written to a directory it is monitoring?

@Brad Surdick

GetFile does not stream the file as it is being written. If the GetFile processor is not configured correctly, it can pull the incomplete file, possibly multiple times. To prevent this, set the GetFile property Minimum File Age to a value such as 30 seconds. This property is the minimum age a file must reach before it is pulled; any file younger than this (according to its last modification date) is ignored. As long as the writer keeps appending to the file, its modification time stays recent, so GetFile leaves it alone until writing has been idle for at least that long.
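
To make the age check concrete, here is a minimal Python sketch of the same idea: list a directory and keep only files whose last-modified time is at least a given number of seconds in the past. This is not NiFi's code; the function name `files_ready_for_pickup` and its parameters are hypothetical, chosen only for illustration.

```python
import os
import time
from pathlib import Path


def files_ready_for_pickup(directory, min_age_seconds, now=None):
    """Return the names of regular files that have not been modified
    for at least min_age_seconds.

    Hypothetical helper that mimics the idea behind a minimum file age
    filter; it is an illustration, not NiFi's implementation.
    """
    if now is None:
        now = time.time()
    ready = []
    for entry in Path(directory).iterdir():
        if not entry.is_file():
            continue  # skip subdirectories and other non-file entries
        age = now - entry.stat().st_mtime  # seconds since last modification
        if age >= min_age_seconds:
            ready.append(entry.name)
    return sorted(ready)
```

Calling `files_ready_for_pickup("/some/landing/dir", 30)` (the path is a placeholder) would skip any file modified within the last 30 seconds. Note the limit of this kind of check: if a transfer stalls for longer than the configured age, the partial file would still look "old enough", so the value should probably be chosen with the slowest expected writer in mind.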

