NiFi: How to check if a large file has been completely written to a directory without using the Minimum File Age setting?

Contributor

Is the Minimum File Age setting on a GetFile processor the only way to check if large files have been completely written to its directory?

Instead of using a timer, is there a way to ask for the current filesize attribute after I use GetFile? I was trying to use UpdateAttribute/RouteOnAttribute processors to repeatedly check the filesize attribute until the file has been completely written.

Thanks

1 ACCEPTED SOLUTION

Master Guru

@K Hajjar

Sharing some thoughts you could try. You may have the option of using the ExecuteProcess processor to fetch the file's size and store that size on local disk. On the next pass through the loop, check the new size against the size stored on local disk. If the size has not changed, you can route the flow onward with RouteOnAttribute.

Contributor

Awesome, exactly what I'm looking for, thank you. I'll try that.

Contributor
@Sunile Manjee

So I would configure the "Command" property to fetch the file's current size using Unix commands?

Then the size would be written and stored to local disk?

Reading the documentation, it seems like the property is used to provide the path of an executable, but I don't have one to specify.

Master Guru

I would use the ExecuteProcess processor. You would execute, for example, a Python or shell script that implements the logic I shared above. Make sense?
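
For anyone looking for a starting point, here is a minimal sketch of such a script in Python. The script name (check_size.py), the state-file argument, and the "stable"/"growing" output strings are assumptions made for illustration, not anything NiFi prescribes. It records the file's size in a small state file and reports whether the size matches the one recorded on the previous run.

```python
#!/usr/bin/env python3
"""Report whether a file's size has stopped changing between two runs."""
import os
import sys


def size_is_stable(path, state_path):
    """Return True if the size of `path` equals the size saved by the previous run.

    The current size is always written to `state_path` (a hypothetical
    state file) so the next run has something to compare against.
    """
    current = os.path.getsize(path)

    # Read the size recorded by the previous run, if there was one.
    previous = None
    if os.path.exists(state_path):
        with open(state_path) as f:
            text = f.read().strip()
        if text.isdigit():
            previous = int(text)

    # Store the current size for the next run.
    with open(state_path, "w") as f:
        f.write(str(current))

    return previous == current


# Only act when called as: check_size.py <file> <state-file>
if __name__ == "__main__" and len(sys.argv) == 3:
    print("stable" if size_is_stable(sys.argv[1], sys.argv[2]) else "growing")
```

In ExecuteProcess you could then point the Command property at the Python interpreter and pass the script path, the file, and the state file in Command Arguments. Keep in mind that an unchanged size between two checks only suggests the write has finished; a writer that stalls for longer than the check interval would look the same.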

Contributor

got it! thanks

Master Guru

A common way to do this is to have the file written to ".filename" first and renamed to "filename" when done. This is why the GetFile processor File Filter property defaults to:

 [^\.].*

That regular expression matches any filename that doesn't start with a period.
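
To see the effect of that filter, here is a small Python check. It assumes the filter is `[^\.].*` and that the whole filename has to match it; Python's `re` module stands in for the Java regex engine NiFi uses, which behaves the same way for a pattern this simple. The filenames are made up for illustration.

```python
import re

# Assumed filter: one character that is not a period, then anything.
FILE_FILTER = re.compile(r"[^\.].*")


def is_picked_up(filename):
    """Return True if the entire filename matches the filter."""
    return FILE_FILTER.fullmatch(filename) is not None


# Hypothetical filenames: two finished files and two still-hidden ones.
for name in ["data.csv", ".data.csv", "report.2016.txt", ".hidden"]:
    print(name, is_picked_up(name))
```

Names beginning with a period fail the match, so a file that is still being written as ".filename" would be skipped until it is renamed.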

I realize you may not have control over how the files are written to the directory, though, in which case this may not be an option.
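
If you can change the writing side, the dot-file-then-rename approach might look like the following Python sketch. The function name and its arguments are hypothetical. It relies on `os.replace`, which renames atomically on POSIX systems when source and destination are on the same filesystem, so a reader never sees a half-written file under the final name.

```python
import os


def write_then_rename(directory, filename, chunks):
    """Write data to ".filename" first, then rename it to "filename" when done."""
    hidden = os.path.join(directory, "." + filename)
    final = os.path.join(directory, filename)

    # While this loop runs, only the dot-prefixed name exists in the directory.
    with open(hidden, "wb") as f:
        for chunk in chunks:
            f.write(chunk)

    # The file becomes visible under its final name only after it is complete.
    os.replace(hidden, final)
    return final
```

Calling `write_then_rename("/some/input/dir", "big.dat", chunks)` with an iterable of byte strings would leave only "big.dat" in the directory once the write has finished; the directory path here is just a placeholder.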