How to extract NiFi provenance?

New Contributor

I would like to collect all of the NiFi provenance data in my project and store it in a file using a custom NiFi processor, but I cannot find a working solution. Does anyone know how to do this? The code needs to go in the custom processor's onTrigger method.

Accepted Solutions

Re: How to extract NiFi provenance?

Master Guru

@Alexander Aolaritei

NiFi can produce a lot of provenance data. The solution you are looking for will be coming in Apache NiFi 1.0 in the form of a NiFi Reporting Task. This "SiteToSiteProvenanceReportingTask" will use the NiFi Site-to-Site (S2S) protocol to send provenance events to another NiFi instance in configurable batches. Of course, that target NiFi instance could be the same instance that generates the events; however, handling those messages locally would just produce even more provenance events. So it may be wise to stand up another NiFi instance just for provenance event handling. Upon receiving those provenance events via an S2S input port, you can use standard NiFi processors to split/merge them, route them, and store them in your desired endpoint (whether that is local files, an external DB, etc.).

I am not a developer so cannot help with the custom solution you are working on, but just want to share what is coming as another viable solution to your needs.

Thanks,

Matt
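
For readers who want to see what the "store them in a file" step might look like outside of NiFi, here is a minimal sketch in Python. It assumes each batch delivered by the reporting task arrives as a JSON array of event objects; the function name `append_events`, the output path, and the field names used in the test data are illustrative and not taken from this thread. It is not the custom-processor solution the question asks for, only a rough picture of the downstream handling.

```python
import json
from pathlib import Path


def append_events(batch_json: str, out_path: str) -> int:
    """Append every event in one batch to a file, one JSON object per line.

    Assumption: the batch is a JSON array of provenance event objects.
    Returns the number of events written.
    """
    events = json.loads(batch_json)
    if not isinstance(events, list):
        raise ValueError("expected a JSON array of provenance events")

    # Open in append mode so successive batches accumulate in the same file.
    with Path(out_path).open("a", encoding="utf-8") as out:
        for event in events:
            # One event per line keeps the file easy to grep and to re-read.
            out.write(json.dumps(event, sort_keys=True) + "\n")
    return len(events)
```

Calling `append_events(batch, "provenance_events.jsonl")` once per received batch would grow a newline-delimited JSON file. Inside NiFi itself, the same result can likely be reached with standard processors, as described above.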

Re: How to extract NiFi provenance?

@mclark

Can you give us an approximate month when NiFi 1.0 will be available to the community?

Thanks.

Re: How to extract NiFi provenance?

Master Guru

NiFi 1.0 is deep into development right now. Expect to see it up for vote in August. NiFi 1.0 has had considerable rework done across the board (new UI, no more NCM for clustering, etc.). Very exciting stuff.

Re: How to extract NiFi provenance?

Explorer

@philg

Hello,

I would like to log all the data transformations done in my dataflow (DF), processor by processor.

Data provenance and the SiteToSiteProvenanceReportingTask seem to be the right things to investigate, along with the NiFi REST API (provenance and provenance events),

but I do not know how to proceed. For example, how do I call the REST API? The parameters are not very clear.

Any help?

phil

best regards
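
Since the thread was locked before this was answered, here is a hedged sketch of how a provenance query over the REST API might look. It assumes an unsecured NiFi 1.x instance and the asynchronous query pattern described in the NiFi REST API documentation: submit a query with POST to `/provenance`, poll it with GET on `/provenance/{id}` until it reports `finished`, then DELETE it. The base URL, the helper names `build_query`, `http_json`, and `fetch_events`, and the polling limits are illustrative; check the endpoint paths and field names against the REST documentation for your NiFi version.

```python
import json
import time
import urllib.request


def build_query(max_results=100):
    """Build the request body for a provenance query (field names assumed from NiFi 1.x docs)."""
    return {"provenance": {"request": {"maxResults": max_results}}}


def http_json(method, url, body=None):
    """Send a JSON request with the standard library and decode the JSON response."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else {}


def fetch_events(base_url, max_results=100, send=http_json,
                 poll_seconds=1.0, max_polls=30):
    """Submit a provenance query, wait for it to finish, and return its events."""
    # Step 1: submit the query; the server answers with a query id.
    query = send("POST", f"{base_url}/provenance", build_query(max_results))["provenance"]
    query_url = f"{base_url}/provenance/{query['id']}"
    try:
        # Step 2: poll until the query reports that it has finished.
        polls = 0
        while not query.get("finished"):
            if polls >= max_polls:
                raise TimeoutError("provenance query did not finish in time")
            time.sleep(poll_seconds)
            query = send("GET", query_url)["provenance"]
            polls += 1
        return query.get("results", {}).get("provenanceEvents", [])
    finally:
        # Step 3: always delete the query so it does not linger on the server.
        send("DELETE", query_url)
```

A call such as `fetch_events("http://localhost:8080/nifi-api")` (a hypothetical local address) would return a list of event dictionaries. A secured instance would additionally need authentication, which this sketch leaves out.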

Re: How to extract NiFi provenance?

Please don't post to old threads that are already resolved; create a new question instead. I will lock this one now.
