Created 09-16-2025 09:01 AM
Hello
I have a question. Say we have a flow with a HandleHttpRequest processor, then an ExecuteSQL processor, and then a HandleHttpResponse processor. We receive about 1000 calls to this flow daily. Where do the FlowFiles get stored? I see a great many FlowFiles in provenance. Are they purged, or do they all accumulate? I have limited disk space, so I wanted to know.
Created 09-16-2025 11:41 AM
@AlokKumar
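NiFi FlowFiles consist of two parts: the FlowFile attributes/metadata, which are held in the flowfile_repository, and the FlowFile content, which is held in content claims within the content_repository.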
Content claims whose claimant count is zero are archived, meaning they are moved to "archive" subdirectories within the content_repository. Archiving can be disabled, in which case content claims whose claimant count is zero are deleted immediately. A background archive thread monitors archived content claims and deletes them based on the archive retention settings in the nifi.properties file. A common misunderstanding concerns how the "nifi.content.repository.archive.max.usage.percentage" property works. Let's say it is set to 80%. Once the disk where the content_repository resides reaches 80% capacity, the archive thread starts purging archived content claims to try to bring disk usage back below 80%. The setting does not cap total disk usage: if all archived content claims have already been deleted, NiFi continues to allow new content claims to be created, potentially filling the disk to 100%. For this reason it is VERY important that the content_repository is allocated its own physical or logical disk.
For more detail, see the community article "Understanding how NiFi Content Repository Archiving works".
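As a rough illustration of the archive settings described above, the relevant entries in nifi.properties might look like the excerpt below. The values are examples only (the 80% matches the scenario above; the retention period is an assumed value), so check your own nifi.properties for what is actually configured.

```properties
# Excerpt from nifi.properties (illustrative values, not recommendations)

# Set to false to delete content claims as soon as their claimant count is zero
nifi.content.repository.archive.enabled=true

# Archived content claims older than this become eligible for deletion
# (assumed example value)
nifi.content.repository.archive.max.retention.period=7 days

# Archive purging starts once the content_repository disk reaches this usage.
# This is not a hard cap on total disk usage.
nifi.content.repository.archive.max.usage.percentage=80%
```

With limited disk space, a shorter retention period or disabling the archive reduces how much archived content is kept, at the cost of being able to view or replay less content from provenance.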
In NiFi provenance you are seeing provenance event data, which includes metadata about the FlowFile but not the content itself. If the content claim referenced by the FlowFile in a provenance event no longer exists in the content_repository (either inside or outside an archive subdirectory), you will have no option to replay or view the content. Provenance is written to its own provenance_repository directory, and its retention is also configurable in the nifi.properties file.
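As a sketch of the provenance retention settings mentioned above, these are the nifi.properties entries I would look at first. The values shown are assumed examples, not a statement of what your installation uses; provenance events are aged off once they exceed the configured time or the repository exceeds the configured size.

```properties
# Excerpt from nifi.properties (illustrative values, not recommendations)

# Location of the provenance repository, separate from the content_repository
nifi.provenance.repository.directory.default=./provenance_repository

# Maximum time to keep provenance events (assumed example value)
nifi.provenance.repository.max.storage.time=30 days

# Maximum amount of provenance data to keep (assumed example value)
nifi.provenance.repository.max.storage.size=10 GB
```

Lowering either value reduces the disk space used by provenance, so the many events you see in the UI do not accumulate indefinitely.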
Please help our community grow. If any of the suggestions/solutions provided helped you solve your issue or answer your question, please take a moment to log in and click "Accept as Solution" on one or more of them.
Thank you,
Matt
Created 09-26-2025 06:08 AM
@AlokKumar
Did the information provided in the response(s) in this thread help you? If so, please take a moment to accept the answer that helped.
Thank you,
Matt