Created 11-01-2017 09:15 AM
I am trying to find how long each processor in my data pipeline takes to process the data, either by reading nifi-app.log or through REST API calls. Is there a way to find how long each processor took to process its incoming flow files, and the volume of data it processed?
Created on 11-01-2017 11:52 AM - edited 08-18-2019 01:08 AM
Hi @Tanmoy
If you are referring to the lineage duration shown in the NiFi provenance UI (see the picture below), you can use SiteToSiteProvenanceReportingTask to send provenance data to a NiFi cluster. Once NiFi receives it, you can store it wherever you want (a file, a Solr index, a database, etc.).
To test it, open the hamburger menu at the top right, go to Controller Settings, then Reporting Tasks, and add a SiteToSiteProvenanceReportingTask. Configure it to send data to the same cluster, as shown below:
Notice the Input Port Name property. I called it prov, so I need to add an input port named prov to the NiFi flow. From this input port I then decide what to do with the provenance data (store it somewhere).
Data you will be receiving looks like this:
{
  "eventId": "5e5acd4a-46e5-4bb7-b957-22b89b7c4bb5",
  "eventOrdinal": 67,
  "eventType": "ATTRIBUTES_MODIFIED",
  "timestampMillis": 1509535134626,
  "timestamp": "2017-11-01T11:18:54.626Z",
  "durationMillis": -1,
  "lineageStart": 1509535134620,
  "componentId": "3829068d-015f-1000-b540-41c80254f8c7",
  "componentType": "UpdateAttribute",
  "componentName": "UpdateAttribute",
  "entityId": "e72bed81-0c26-4dc8-85e2-b4ac7940fcfe",
  "entityType": "org.apache.nifi.flowfile.FlowFile",
  "entitySize": 258,
  "previousEntitySize": 258,
  "updatedAttributes": {
    "project_id": "project_1"
  },
  "previousAttributes": {
    "path": "./",
    "uuid": "e72bed81-0c26-4dc8-85e2-b4ac7940fcfe",
    "filename": "721201919566618"
  },
  "actorHostname": "abdelkrjidjmbp2",
  "contentURI": "http://abdelkrjidjmbp2:8080/nifi-api/provenance-events/67/content/output",
  "previousContentURI": "http://abdelkrjidjmbp2:8080/nifi-api/provenance-events/67/content/input",
  "parentIds": [],
  "childIds": [],
  "platform": "nifi",
  "application": "NiFi Flow"
},
You can see the name of the processor (UpdateAttribute), its type (UpdateAttribute), its ID (3829068d-015f-1000-b540-41c80254f8c7) and the provenance event ID (5e5acd4a-46e5-4bb7-b957-22b89b7c4bb5). The flow file's UUID is the entityId (e72bed81-0c26-4dc8-85e2-b4ac7940fcfe), and entitySize gives its size in bytes. To get the lineage duration, compute timestampMillis - lineageStart. In my example that is 1509535134626 - 1509535134620 = 6 ms, as in the first screenshot.
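If it helps, here is a rough Python sketch of that calculation applied to a batch of received events. It assumes the events arrive as a JSON array of objects shaped like the sample above (trimmed here to the fields it uses). The function names lineage_duration_ms and summarize_by_component are made up for illustration and are not part of NiFi. Since durationMillis is -1 in the sample event, the sketch relies only on timestampMillis and lineageStart.

```python
import json
from collections import defaultdict

# Trimmed copy of the sample event; assumes events arrive as a JSON array.
SAMPLE_BATCH = """
[
  {
    "eventId": "5e5acd4a-46e5-4bb7-b957-22b89b7c4bb5",
    "eventType": "ATTRIBUTES_MODIFIED",
    "timestampMillis": 1509535134626,
    "lineageStart": 1509535134620,
    "componentId": "3829068d-015f-1000-b540-41c80254f8c7",
    "componentName": "UpdateAttribute",
    "entitySize": 258
  }
]
"""


def lineage_duration_ms(event):
    """Milliseconds from the start of the flow file's lineage to this event."""
    return event["timestampMillis"] - event["lineageStart"]


def summarize_by_component(events):
    """Per component: event count, mean lineage duration, summed entitySize."""
    totals = defaultdict(lambda: {"name": None, "events": 0, "lineage_ms": 0, "bytes": 0})
    for event in events:
        entry = totals[event["componentId"]]
        entry["name"] = event["componentName"]
        entry["events"] += 1
        entry["lineage_ms"] += lineage_duration_ms(event)
        # entitySize is summed per event, so a flow file that produces
        # several events at one component is counted more than once.
        entry["bytes"] += event.get("entitySize") or 0
    return {
        component_id: {
            "name": entry["name"],
            "events": entry["events"],
            "avg_lineage_ms": entry["lineage_ms"] / entry["events"],
            "total_bytes": entry["bytes"],
        }
        for component_id, entry in totals.items()
    }


events = json.loads(SAMPLE_BATCH)
summary = summarize_by_component(events)
for component_id, stats in summary.items():
    print(component_id, stats)
```

Keep in mind that this measures time since the flow file entered the flow, not time spent inside that one processor, so it includes queueing and every upstream step.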
Created 01-05-2023 09:39 PM
Then what does nifi_average_lineage_duration do?
Created 01-06-2023 01:17 AM
@shekabhi, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
Regards,
Vidya Sargur,