
Get FlowFile for provenance event



I'm using the SiteToSiteProvenanceReportingTask. In my flow I can then read a provenance event, which references some FlowFile. But how do I access that FlowFile?




Provenance events are the history of what happened to a FlowFile. In most cases the FlowFile has already passed through the flow and is no longer in the system. These events are used to understand what happened to a piece of data, where it came from, where it was delivered to, etc.

Depending on how you have your content repository configured, the content for a provenance event may still exist in the content repo. There is a concept of replaying a provenance event, and the REST API has an endpoint for it: /provenance-events/replays
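
As a rough illustration of calling that endpoint, here is a minimal Python sketch that only builds the replay request. The base URL, the port, the bearer-token handling, and the helper names build_replay_request and submit_replay are assumptions for illustration. The body fields eventId and clusterNodeId reflect my understanding of the NiFi REST API, so check them against the REST API docs for your NiFi version.

```python
import json
import urllib.request


def build_replay_request(base_url, event_id, cluster_node_id=None, token=None):
    """Build (but do not send) a POST to /provenance-events/replays.

    base_url is assumed to point at the NiFi REST API root,
    e.g. "http://localhost:8080/nifi-api" (hypothetical host and port).
    """
    # Body fields as I understand the replay request entity; verify for your version.
    payload = {"eventId": event_id}
    if cluster_node_id is not None:
        # Only relevant when NiFi runs as a cluster.
        payload["clusterNodeId"] = cluster_node_id

    headers = {"Content-Type": "application/json"}
    if token:
        # Assumes a secured instance that accepts a bearer token.
        headers["Authorization"] = "Bearer " + token

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/provenance-events/replays",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def submit_replay(request):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Against a running instance you would call something like submit_replay(build_replay_request("http://localhost:8080/nifi-api", 1234)). The replay can only succeed if the content for that event is still present in the content repository.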



So now I need help with what to do next, as I have no idea what the proper way to do this in NiFi is.

I have JSON data and an Avro schema. I want to validate the JSON against the Avro schema (ValidateRecord processor?) and, if validation fails, route the FlowFile to the failure relationship along with a description of what went wrong.

Currently this description is emitted as a provenance event. I have modified the code of ValidateRecord, and I can use that if there isn't a better way. I thought provenance events were the way to go, and because of that I closed my pull request.

What is your recommendation?
