Created 10-30-2015 05:16 PM
Hi, I'm trying to understand how the content repository, sizing/capping, and archiving are related, reading through the Admin Guide: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html
By default the archive feature is disabled.
Created 10-30-2015 05:24 PM
When content in the Content Repository is no longer needed (i.e., no FlowFile references the content), NiFi will do one of two things. If archiving is enabled, it will just mark the content as archived. If archiving is disabled, it will delete the content.
The "nifi.content.repository.archive.max.usage.percentage" property in nifi.properties can be used to limit how much disk space archived content is allowed to hold on to. The default value is 50%. This means that once 50% of the disk where the content repository resides is used up, NiFi will start deleting the oldest archived content in order to prevent more than 50% usage. Note that this is not the same as saying the archive itself can take up 50% of the disk space. Rather, it says: delete as much of the archived data as needed to stay below 50% total disk usage.
The "nifi.content.repository.archive.max.retention.period" property can be used to ensure that data is not archived for more than some time period. For instance, setting this to "2 days" means that any data that is archived for 2 days will be deleted, even if only 1% of the disk space is used up. This is often used for compliance purposes.
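To make the interaction of the two properties concrete, here is a rough Python sketch of the rules as described above. It is a simplified model for illustration only, not NiFi's actual implementation: the `ArchivedClaim` type, the `select_for_deletion` function, and all the numbers are made up for the example, and whether the usage check is "more than" or "at least" 50% is my assumption.

```python
from dataclasses import dataclass


@dataclass
class ArchivedClaim:
    """Hypothetical stand-in for one piece of archived content."""
    name: str
    size_bytes: int
    age_hours: float


def select_for_deletion(archived, disk_used_bytes, disk_total_bytes,
                        max_usage_fraction=0.50, max_retention_hours=48.0):
    """Return (names to delete, disk usage after deletion).

    Simplified model of the two rules from the post:
      * anything archived for at least max_retention_hours is deleted,
        no matter how empty the disk is;
      * while total disk usage is above max_usage_fraction, the oldest
        archived content is deleted first.
    """
    to_delete = []
    used = disk_used_bytes
    # Walk the archive from oldest to newest.
    for claim in sorted(archived, key=lambda c: c.age_hours, reverse=True):
        expired = claim.age_hours >= max_retention_hours
        over_limit = used / disk_total_bytes > max_usage_fraction
        if expired or over_limit:
            to_delete.append(claim.name)
            used -= claim.size_bytes
    return to_delete, used


# Disk is 70% full; "a" is past the 2-day retention, "b" and "c" are not.
archive = [
    ArchivedClaim("a", 100, 72.0),
    ArchivedClaim("b", 100, 10.0),
    ArchivedClaim("c", 100, 1.0),
]
print(select_for_deletion(archive, disk_used_bytes=700, disk_total_bytes=1000))
# -> (['a', 'b'], 500)
```

In this example "a" goes because of the retention period, "b" goes only because the disk is still over 50% after "a" is removed, and "c" survives because usage is back at the limit. Note that active (non-archived) content is never touched in this model, so if live content alone exceeds 50% of the disk, the archive simply ends up empty.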
Created 10-30-2015 05:32 PM
To close the loop on some offline discussions, here are a few scenarios to help with the understanding:
Created 08-31-2018 07:55 AM
Very old topic, but still valid. Do I understand correctly that if provenance is not enabled, it makes no sense to have the content repo archive enabled? And if both are enabled, how can we recover a single flow via the provenance window?