NiFi: Best practice to back up data provenance

Explorer

Hi,

We are planning to use NiFi's Data Provenance over the long term for audits.

What is the best way to configure NiFi for this?

If I "simply" back up the Data Provenance disk content, will I be able to use it later, for example by reinjecting it into a working NiFi?

Should I use ReportingTaskProcessor? But again, how do you query the backed-up data later? By keeping it in a dedicated NiFi used only for backup?

I also did not understand how FlowFile content is managed. Is it supposed to be stored on the Data Provenance disk (which would greatly increase its size...)? Or does the "replay" button in the Data Provenance UI work only if the content is still fresh and present in the content repository?

Or is Data Provenance just not meant to be used for long-term purposes?

Sorry if I am mixing up multiple concepts.

Thanks.

Master Mentor

FlowFile content is not stored in the provenance repository. Viewing or replaying content from a provenance event works only if that content still exists in the content repository. The content repository can be configured to retain archived content, but keep in mind that the content of active FlowFiles still in your dataflows always takes priority over archived content. If active data pushes disk usage past the configured threshold, all archived content will be purged.
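
As a rough illustration of the archive settings mentioned above, the relevant entries in nifi.properties look something like the fragment below. The property names are the ones NiFi uses for content archiving; the values are assumptions picked for illustration only, not recommendations, and the defaults differ between NiFi versions.

```
# Keep content claims in the archive after FlowFiles no longer reference them
nifi.content.repository.archive.enabled=true

# Illustrative value: how long archived content may be kept
nifi.content.repository.archive.max.retention.period=12 hours

# Illustrative value: content repository disk usage at which
# archived content starts being removed
nifi.content.repository.archive.max.usage.percentage=50%
```

With settings like these, replay from the Data Provenance UI is only possible for as long as the archived content survives both limits, so a long provenance history does not by itself guarantee that the content can still be viewed or replayed.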

Thanks,

Matt

-

If you found that this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.