How to back up, store and read NiFi Provenance (Lucene) *.gz files

Contributor

We were looking at storing the provenance *.gz (Lucene) files so we could have a record of all activity through NiFi, as our IT security people are fussy about such things.

It seems NiFi writes to multiple files at the same time, so this may not be possible. Is there perhaps a better way of doing this?

I would welcome thoughts as to how we might do this, or maybe propose a different way to achieve the same logging outcome.

Thanks in advance. 

2 REPLIES

Master Collaborator

Hello @zzzz77

Maybe this blog can help you: https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Arc...
It explains how to handle the repository archive, which could work for what you need.

There are also other options, such as the Reporting Tasks documented here: https://nifi.apache.org/docs/nifi-docs/
SiteToSiteProvenanceReportingTask is one option that fits your need.


Regards,
Andrés Fallas

Master Mentor

@zzzz77 

Provenance can be very noisy, depending on the size of your dataflows and the number of FlowFiles being processed through them. The provenance repository has age and size settings that trigger roll-off of old events, so you may not reach the retention age if you reach the size limit first. I also would not try to read provenance files while they are being written to.
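To see which age and size limits apply on a given node, you can look at the provenance settings in nifi.properties. Below is a minimal Python sketch that pulls the two retention properties out of the file's text. The property names are the ones listed in the NiFi Administration Guide; the function name and the sample values are only illustrative, so check them against your own installation.

```python
# Retention-related property names, as listed in the NiFi Administration Guide.
RETENTION_KEYS = (
    "nifi.provenance.repository.max.storage.time",
    "nifi.provenance.repository.max.storage.size",
)


def read_retention_settings(properties_text):
    """Return the provenance retention settings found in nifi.properties text."""
    settings = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in RETENTION_KEYS:
            settings[key.strip()] = value.strip()
    return settings


# Sample values for illustration only; your nifi.properties may differ.
sample = """
# Provenance Repository Properties
nifi.provenance.repository.max.storage.time=30 days
nifi.provenance.repository.max.storage.size=10 GB
"""

for key, value in read_retention_settings(sample).items():
    print(f"{key} = {value}")
```

Whichever limit is hit first triggers the roll-off, so a busy flow with a small size limit can lose events long before the configured age.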

The SiteToSiteProvenanceReportingTask might be the solution you are looking for in Apache NiFi. This reporting task sends all provenance events over the Site-to-Site protocol to a target NiFi, where you can then feed them into any long-term storage medium of your choice in a human-readable format.
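As a rough illustration of what the long-term storage side could look like, here is a Python sketch that takes one batch of events and appends it to a gzip-compressed JSON Lines archive. It assumes each batch arrives as a JSON array of event objects; the field names in the sample (eventId, eventType, componentName) and both function names are assumptions for illustration, not something confirmed in this thread. In a real flow you would more likely do this with NiFi processors on the receiving instance.

```python
import gzip
import json


def to_json_lines(batch_json):
    """Convert a JSON array of events into one compact JSON object per line."""
    events = json.loads(batch_json)
    return "".join(json.dumps(event, sort_keys=True) + "\n" for event in events)


def append_to_archive(path, batch_json):
    """Append a batch of events to a gzip-compressed JSON Lines file."""
    # Append mode adds a new gzip member; gzip.open reads all members back.
    with gzip.open(path, "at", encoding="utf-8") as archive:
        archive.write(to_json_lines(batch_json))


def read_archive(path):
    """Yield each archived event as a dict."""
    with gzip.open(path, "rt", encoding="utf-8") as archive:
        for line in archive:
            yield json.loads(line)


# Hypothetical batch; the field names are assumed for illustration.
sample_batch = json.dumps([
    {"eventId": "1", "eventType": "RECEIVE", "componentName": "GetFile"},
    {"eventId": "2", "eventType": "SEND", "componentName": "PutSFTP"},
])
```

Because these archive files are written by your own flow rather than by the provenance repository, they can be closed and rotated on your schedule, which avoids reading files that NiFi is still writing to.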

 

Thank you,
Matt