Hello. Yesterday I noticed that the mechanism for purging old files from the content repository had not run. It was configured for 12 hours and 20% of disk space, but it was taking up about 80% of the content repository, and some of the files were 14 days old. Any ideas why this could happen? By the way, after restarting NiFi the content repo purge ran and everything was OK. Has anyone else faced this problem?
@Henry Peterson What you are describing is normal behavior for the repository. You can find a much deeper dive into the details in the following article:
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Steven @ DFHZ
I read this article twice but still have no idea what could have caused it. Would you be kind enough to explain? By the way, I'm dealing with 100 MB files which are split into many small files.
Read the section on content archiving. The content of FlowFiles in the content repo is stored in content claims (files on disk). A single claim can hold the content of more than one FlowFile. A claim does not get released or archived until every FlowFile in that claim has cleared NiFi. Check for FlowFiles sitting in dataflow dead ends and in connections that handle errors.
From the article:
A content claim cannot be moved into the content repository archive until none of the pieces of content in that claim are tied to a FlowFile that is active anywhere within any dataflow on the NiFi canvas. What this means is that the reported cumulative size of all the FlowFiles in your dataflows will likely never match the actual disk usage in your content repository.
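To make that concrete, here is a toy model in Python of the behavior the article describes. It is only a sketch, not NiFi code: the `Claim` class, its methods, and the sizes are all made up for illustration. It assumes 100 small splits land in one claim and that a single split is left sitting in an error connection.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    """Hypothetical stand-in for a content claim: one file on disk
    that holds the content of several FlowFiles."""
    sizes: dict = field(default_factory=dict)   # FlowFile id -> content size in bytes
    active: set = field(default_factory=set)    # FlowFiles still somewhere on the canvas

    def add(self, ff_id, size):
        self.sizes[ff_id] = size
        self.active.add(ff_id)

    def finish(self, ff_id):
        # The FlowFile has left the dataflow (dropped, sent, auto-terminated).
        self.active.discard(ff_id)

    def disk_bytes(self):
        # The whole claim file stays on disk, whatever is still active.
        return sum(self.sizes.values())

    def active_bytes(self):
        # What the queued FlowFile sizes on the canvas would add up to.
        return sum(self.sizes[f] for f in self.active)

    def archivable(self):
        # The claim can move to the archive only when no FlowFile references it.
        return not self.active


claim = Claim()
for i in range(100):
    claim.add(f"split-{i}", 10_000)   # 100 small splits written to one claim

for i in range(99):
    claim.finish(f"split-{i}")        # 99 of them complete normally

print(claim.active_bytes())  # 10000
print(claim.disk_bytes())    # 1000000
print(claim.archivable())    # False

claim.finish("split-99")              # the last split is finally cleared
print(claim.archivable())    # True
```

In this sketch one 10 KB split left in a queue keeps the full 1 MB claim out of the archive, which is why the sizes reported on the canvas can be far smaller than what the content repository uses on disk.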