How to extract NiFi provenance?

I would like to get all of the NiFi provenance data in my project and store it within a file using a custom NiFi processor, but I cannot find a working solution. Does anyone know how I can get this done? The code should be written in the custom processor's onTrigger method block.

1 ACCEPTED SOLUTION

Super Mentor

@Alexander Aolaritei

NiFi can produce a lot of provenance data. The solution you are looking for is coming in Apache NiFi 1.0 in the form of a NiFi reporting task. This "SiteToSiteProvenanceReportingTask" will use the NiFi Site-to-Site (S2S) protocol to send provenance events to another NiFi instance in configurable batches. Of course, that target NiFi instance could be the same instance that generates the events; however, that would just produce even more provenance events locally as you handle those messages, so it may be wise to stand up another NiFi instance just for provenance event handling. Upon receiving those provenance events via an S2S input port, you can use standard NiFi processors to split/merge them, route them, and store them in your desired endpoint (local files, an external DB, etc.).
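As a rough illustration of the "split and store in a local file" step, here is a minimal Python sketch that runs outside NiFi. It assumes each batch arrives as a JSON array of event objects, which is my understanding of how the reporting task serializes events, so treat that as an assumption. The function name `append_events` and the `eventId` field used in the usage note are hypothetical.

```python
import json


def append_events(batch_json, path):
    """Append every event in one batch to `path`, one JSON object per line.

    Assumption: `batch_json` is a JSON array of provenance event objects.
    Returns the number of events written.
    """
    events = json.loads(batch_json)
    if isinstance(events, dict):
        # Tolerate a single event that was not wrapped in an array.
        events = [events]
    # Append mode keeps earlier batches; each event becomes one line.
    with open(path, "a", encoding="utf-8") as out:
        for event in events:
            out.write(json.dumps(event, sort_keys=True) + "\n")
    return len(events)
```

Calling `append_events('[{"eventId": 1}, {"eventId": 2}]', "provenance.jsonl")` would add two lines to the file. Inside NiFi the same result is normally reached with standard processors rather than a script.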

I am not a developer, so I cannot help with the custom solution you are working on, but I wanted to share what is coming as another viable solution for your needs.

Thanks,

Matt

5 REPLIES

Contributor
@mclark

Can you give us an approximate month when NiFi 1.0 will be available to the community?

Thanks.

Super Mentor

NiFi 1.0 is deep in development right now. Expect to see it up for a vote in August. NiFi 1.0 includes considerable rework across the board (new UI, no more NCM for clustering, etc.). Very exciting stuff.

Explorer

@philg

Hello,

I would like to log all the data transformations done in my DF, processor by processor.

Data provenance and the SiteToSiteProvenanceReportingTask seem to be the right things to investigate, along with the NiFi REST API (provenance + provenance events),

but I do not know how to proceed, for example how to call the REST API (the parameters are not very clear).

Any help?

phil

best regards
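For anyone landing here with the same REST API question, below is a minimal Python sketch of the usual submit / poll / delete pattern for a provenance query. The endpoint paths and JSON field names (`provenance`, `request`, `maxResults`, `finished`, `results.provenanceEvents`) reflect my understanding of the NiFi 1.x REST API and should be checked against the REST API documentation for your version. The base URL, the helper names `build_query`, `call`, and `fetch_events`, and the `send` parameter are hypothetical. The sketch assumes an unsecured instance; a secured one would also need an authorization header.

```python
import json
import time
import urllib.request

# Assumption: an unsecured NiFi 1.x instance at this address.
BASE_URL = "http://localhost:8080/nifi-api"


def build_query(max_results=100):
    """Build the JSON body for a provenance query (field names assumed)."""
    return {"provenance": {"request": {"maxResults": max_results}}}


def call(method, url, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        text = resp.read().decode("utf-8")
        return json.loads(text) if text else {}


def fetch_events(base_url=BASE_URL, max_results=100, poll_seconds=1.0, send=call):
    """Submit a provenance query, wait for it to finish, return its events."""
    # 1. Submit the query; it runs asynchronously and is identified by an id.
    query = send("POST", base_url + "/provenance", build_query(max_results))["provenance"]
    query_url = base_url + "/provenance/" + query["id"]
    try:
        # 2. Poll until the query reports that it has finished.
        while not query.get("finished"):
            time.sleep(poll_seconds)
            query = send("GET", query_url)["provenance"]
        return query.get("results", {}).get("provenanceEvents", [])
    finally:
        # 3. Delete the query so it does not linger on the server.
        send("DELETE", query_url)
```

Calling `fetch_events(max_results=50)` against a running instance would return a list of event dictionaries, which can then be written to a file or inspected per processor. The `send` parameter exists only so the HTTP layer can be swapped out, for example in a test.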

Please don't post to old threads that are already resolved; create a new question instead. I will lock this one now.