Created on 05-22-2018 03:53 PM - edited 08-17-2019 11:37 PM
We have set up the archival process in NiFi with the configuration below.
# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=/opt/nifi/data/content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/
Please refer to the attached screenshot: there is nothing in the archive directories in the content repository.
Dilip
Created 05-22-2018 04:02 PM
The fact that every "archive" sub-directory is empty leads me to believe that archive is in fact working correctly. NiFi stores FlowFile content in claims within the content repository. One claim may contain the content of one or many FlowFiles. It takes only one FlowFile that is still active in one of your dataflows (queued in some NiFi connection) to hold up an entire content claim. A content claim cannot be moved to archive until every active FlowFile referencing that claim is complete (meaning it has reached a point of termination in your dataflow).
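To make the claim behavior above concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not NiFi's actual implementation; the `ContentClaim` class, its methods, and the FlowFile ids are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContentClaim:
    """Toy stand-in for a content claim holding content for several FlowFiles."""
    name: str
    active_flowfiles: set = field(default_factory=set)

    def complete(self, flowfile_id: str) -> None:
        # The FlowFile reached a point of termination in the dataflow.
        self.active_flowfiles.discard(flowfile_id)

    def can_archive(self) -> bool:
        # Eligible for archive only when no active FlowFile references the claim.
        return not self.active_flowfiles


# Hypothetical claim shared by three FlowFiles.
claim = ContentClaim("claim-1", {"ff-1", "ff-2", "ff-3"})

claim.complete("ff-1")
claim.complete("ff-2")
held = claim.can_archive()      # ff-3 is still queued, so the claim is held

claim.complete("ff-3")
released = claim.can_archive()  # no active references remain
```

In this model a single queued FlowFile (`ff-3`) keeps the whole claim out of archive, even though the other two FlowFiles finished long ago.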
-
The following article explains this in more detail:
https://community.hortonworks.com/articles/82308/understanding-how-nifis-content-repository-archivi....
-
Aside from the above, NiFi opens a lot of file handles. Having too few file handles available can cause problems when new files are created, which may affect proper cleanup of both the FlowFile and content repositories. I suggest making sure the user that owns your NiFi process has a high number of open file handles available to it.
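One way to check the open file limit is a short Python script run as the user that owns the NiFi process, on a Unix-like host. This is a rough sketch: the helper names are hypothetical, and the 50000 threshold is an illustrative assumption of mine, not a value given in this thread.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_limit_low(soft, threshold=50000):
    """Report whether the soft limit is below an assumed threshold."""
    if soft == resource.RLIM_INFINITY:
        # An unlimited soft limit is never considered low.
        return False
    return soft < threshold


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    if is_limit_low(soft):
        print("Soft limit looks low for a busy NiFi instance; consider raising it.")
```

The limits reported are those of the shell that launches the script, so they match what NiFi sees only if NiFi is started by the same user under the same limits.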
-
Thanks,
Matt
-
If you found that this answer addressed your question, please take a moment to log in and click "accept" below the answer.