Created on 05-14-2025 04:59 AM - edited 05-14-2025 05:17 AM
Hey, everyone!
Could you help me with the following problem, please? I have a test NiFi instance where I turned on the content repository archive. It worked well, but at one point I decided to increase the value of "nifi.content.repository.archive.max.usage.percentage" from the default 50% to 70%.
As I expected, NiFi then used up to 70% of the total disk space.
After that I disabled archiving and expected NiFi to release all the space used by the archive, but that doesn't happen.
Why is that? I've seen messages saying that archived data is never cleaned up if "nifi.content.repository.archive.enabled" is set to "false" after it has been "true". Is that true?
My current settings:
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=1 MB
nifi.content.repository.directory.repo0=/mnt/nifi/repos/content_repository
nifi.content.repository.archive.max.retention.period=6 hours
nifi.content.repository.archive.max.usage.percentage=60%
nifi.content.repository.archive.backpressure.percentage=70%
nifi.content.repository.archive.enabled=false
nifi.content.repository.always.sync=false
Created 05-14-2025 05:52 AM
@asand3r
Need some more detail to provide a good answer here...
Keep in mind that disabling archive will not prevent the content_repository from filling the disk where it resides to 100%. Content claims associated with actively queued FlowFiles within your dataflows on the NiFi canvas will still exist in the content_repository.
Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt
Created 05-14-2025 06:09 AM
@MattWho thanks for your answer.
1. It's Apache NiFi 1.18.0
2. Yep, NiFi was restarted; I usually restart it after any change to nifi.properties.
3. Hmm, I'm confused here. I'm a newbie, and before I asked my question I thought that NiFi did not move files anywhere else. But now I see 'archive' directories in the content repo. I now have a three-node cluster with archive disabled (after it was enabled earlier); one node has no files inside its 'archive' directories, but the other two do.
4. Sorry, it's from a private chat with my colleagues. 😃
So, basically, if I set "nifi.content.repository.archive.enabled" to "false" and restart the NiFi service, should it delete all previously archived data? I disabled it about 4 hours ago, but two nodes still have files inside "*/archive/*" directories.
[user@nifi-host content_repository]$ pwd
/mnt/nifi/repos/content_repository
[user@nifi-host content_repository]$ find . -path "*/archive/*" | wc -l
3955
Created 05-14-2025 10:29 AM
@asand3r
Setting the following property to false turns off archiving:
nifi.content.repository.archive.enabled
NiFi does not clean up files left in the archive sub-directories once archiving is disabled. With archiving disabled, the archive code that would scan these directories to remove old archive data is no longer executing.
You'll need to manually purge the archived content claims from the archive sub-directories after disabling content_repository archiving.
So your two nodes that still have archive data had that data still present at shutdown, while the other node did not have any archive data at shutdown.
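Before purging anything, it can help to see how much is actually sitting in the archive sub-directories on each node. The following is only a minimal sketch, assuming a Linux host with standard find, du, and awk; archive_report is a hypothetical helper name, not something NiFi ships. It is read-only and deletes nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NiFi): report the number of files under
# any "archive" sub-directory of a content repository and their total size.
archive_report() {
  local repo="${1:?usage: archive_report <content_repository dir>}"
  local files kib

  # Count regular files whose path contains an archive directory.
  files=$(find "$repo" -type f -path "*/archive/*" | wc -l)

  # Sum the size (in 1 KiB blocks) of every directory named "archive".
  kib=$(find "$repo" -type d -name archive -exec du -sk {} + \
        | awk '{ s += $1 } END { print s + 0 }')

  echo "archived files: $((files)), approx. size: ${kib} KiB"
}

# Example call, using the repository path from this thread:
#   archive_report /mnt/nifi/repos/content_repository
```

Running it on each node should show which nodes still hold leftover archive data.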
Thank you,
Matt
Created 05-14-2025 12:07 PM
@MattWho thanks so much.
Is it OK if I simply remove the archived data while NiFi is running, or must I stop a node before deleting?
find /mnt/nifi/repos/content_repository -path "*/archive/*" -exec rm -f {} \;
Created 05-14-2025 12:43 PM
@asand3r
With archive disabled, NiFi is no longer tracking the files left in the archive sub-directories. You can remove those files while NiFi is running. Just make sure you don't touch the active content_repository claims.
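One way to lower the risk of touching active claims is to restrict the delete to regular files under archive sub-directories and to do a dry run first. This is a sketch under assumptions, not an official NiFi procedure: purge_archive and its --delete switch are hypothetical names made up for illustration, and it assumes a find that supports -delete (GNU and BSD find both do).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NiFi): list or delete leftover archive files.
# Without --delete it only prints what would be removed (dry run).
purge_archive() {
  local repo="${1:?usage: purge_archive <content_repository dir> [--delete]}"

  if [ "${2:-}" = "--delete" ]; then
    # -type f limits the match to regular files, so the archive directories
    # themselves and anything outside */archive/* are left in place.
    find "$repo" -type f -path "*/archive/*" -delete
  else
    find "$repo" -type f -path "*/archive/*" -print
  fi
}

# Example calls, using the repository path from this thread:
#   purge_archive /mnt/nifi/repos/content_repository            # dry run
#   purge_archive /mnt/nifi/repos/content_repository --delete   # remove files
```

Reviewing the dry-run output first makes it easy to confirm that only paths containing /archive/ are listed.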
Matt