Created on 01-28-2018 03:57 PM - edited 08-17-2019 09:20 AM
Using SiteToSiteProvenanceReportingTask to Send Provenance to Apache NiFi for Processing.
Eating our own provenance food!
It's almost comically easy to do this. On the server you are reporting on, you set up a reporting task that sends the provenance data to your receiver. On the receiving server, you build a simple flow to ingest and process that data. I stored it in HBase as JSON, since it's a good place to put a lot of data fast.
Send The Data
You need to create a SiteToSiteProvenanceReportingTask under Controller Settings - Reporting Tasks. It's pretty simple: set the destination URL to your receiving NiFi server and the input port name to a port that you have already created on that server.
Receive the Data and Process
An Individual JSON Record
Split the JSON into Records
$.[*]
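As a rough illustration of what the `$.[*]` expression does, here is a minimal Python sketch that takes one batch (a JSON array of provenance events) and emits one JSON document per element. It is not NiFi code; the function name `split_records` and the sample batch are hypothetical, with field names borrowed from the provenance events shown in this article.

```python
import json
from typing import List


def split_records(payload: str) -> List[str]:
    """Return one JSON document per element of a top-level JSON array.

    This mirrors the effect of the JSONPath expression `$.[*]`:
    every element of the array becomes its own record.
    """
    events = json.loads(payload)
    if not isinstance(events, list):
        raise ValueError("expected a top-level JSON array of events")
    return [json.dumps(event) for event in events]


# Hypothetical two-event batch, for illustration only.
batch = (
    '[{"eventType": "ROUTE", "entitySize": 16},'
    ' {"eventType": "ROUTE", "entitySize": 59}]'
)

records = split_records(batch)
for record in records:
    print(record)
```

Each printed line is a standalone JSON object, which is the shape the HBase step expects: one record per flow file.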
Save to HBase (PutHBaseJSON)
First I have to create a table.
hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2.2.6.2.0-205, r5210d2ed88d7e241646beab51e9ac147a973bdcc, Sat Aug 26 09:33:50 UTC 2017

hbase(main):001:0> create 'PROVENANCE', 'event'
0 row(s) in 2.9900 seconds

=> Hbase::Table - PROVENANCE

scan 'PROVENANCE'
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousAttributes, timestamp=1517159115042, value={"path":"./","filename":"humidity.583225-583284.log","s2s.address":"192.168.1.197:55032","s2s.host":"192.168.1.197","mime.type":"text/plain","uuid":"9006a1bb-d755-4272-b8d3-76e666c2a7c6","tailfile.original.path":"/opt/demo/logs/humidity.log"}
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousContentURI, timestamp=1517159115042, value=http://192.168.1.193:8080/nifi-api/provenance-events/61825/content/input
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousEntitySize, timestamp=1517159115042, value=59
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupId, timestamp=1517159115042, value=01611005-4e82-1491-ae5d-ca64f59491cb
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupName, timestamp=1517159115042, value=Process MiniFi Creator
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestamp, timestamp=1517159115042, value=2018-01-28T00:25:30.616Z
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestampMillis, timestamp=1517159115042, value=1517099130616
 ff91e204-05b0-48aa-a666-7942e3f109ab column=event:updatedAttributes, timestamp=1517159115042, value={"RouteOnAttribute.Route":"humidity"}
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:actorHostname, timestamp=1517159114898, value=192.168.1.193
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:application, timestamp=1517159114898, value=NiFi Flow
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:childIds, timestamp=1517159114898, value=[]
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentId, timestamp=1517159114898, value=3a25cda9-0161-1000-813c-631724a10585
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentName, timestamp=1517159114898, value=RouteOnAttribute
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentType, timestamp=1517159114898, value=RouteOnAttribute
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:contentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/output
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:durationMillis, timestamp=1517159114898, value=-1
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityId, timestamp=1517159114898, value=9b017666-7ce9-45c5-9d0a-2f81e56d6fa8
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:entitySize, timestamp=1517159114898, value=16
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityType, timestamp=1517159114898, value=org.apache.nifi.flowfile.FlowFile
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventOrdinal, timestamp=1517159114898, value=61701
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventType, timestamp=1517159114898, value=ROUTE
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:lineageStart, timestamp=1517159114898, value=1517084974341
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:parentIds, timestamp=1517159114898, value=[]
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:platform, timestamp=1517159114898, value=nifi
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousAttributes, timestamp=1517159114898, value={"path":"./","filename":"uv.164064-164080.log","s2s.address":"192.168.1.197:55032","s2s.host":"192.168.1.197","mime.type":"text/plain","uuid":"9b017666-7ce9-45c5-9d0a-2f81e56d6fa8","tailfile.original.path":"/opt/demo/logs/uv.log"}
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousContentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/input
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousEntitySize, timestamp=1517159114898, value=16
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupId, timestamp=1517159114898, value=01611005-4e82-1491-ae5d-ca64f59491cb
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupName, timestamp=1517159114898, value=Process MiniFi Creator
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestamp, timestamp=1517159114898, value=2018-01-28T00:25:30.607Z
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestampMillis, timestamp=1517159114898, value=1517099130607
 ffde140c-3053-4b9d-89c6-14b68025384d column=event:updatedAttributes, timestamp=1517159114898, value={"RouteOnAttribute.Route":"uv"}
1830 row(s) in 11.7680 seconds
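The scan output suggests how each JSON record lands in the table: one row per event, one column per top-level field in the 'event' column family, and nested values such as updatedAttributes kept as JSON text. The Python sketch below imitates that mapping for a single record. It is only an approximation of what PutHBaseJSON appears to do, not its implementation. The function name `to_hbase_cells` is hypothetical, and using `eventId` as the row key (and leaving it out of the columns) is an assumption based on the scan output, where the row key matches neither entityId nor uuid.

```python
import json
from typing import List, Tuple


def to_hbase_cells(
    record_json: str,
    row_field: str = "eventId",  # assumption: the row key comes from this field
    family: str = "event",       # column family created in the shell session
) -> List[Tuple[str, str, str]]:
    """Flatten one JSON record into (row key, family:qualifier, value) cells."""
    record = json.loads(record_json)
    row_key = str(record[row_field])
    cells = []
    for name, value in record.items():
        if name == row_field:
            # Assumption: the row key field is not repeated as a column.
            continue
        if isinstance(value, str):
            text = value
        else:
            # Numbers, arrays, and nested objects are stored as compact JSON text.
            text = json.dumps(value, separators=(",", ":"))
        cells.append((row_key, f"{family}:{name}", text))
    return cells


# Sample record assembled from values in the scan output; the eventId
# field itself is an assumption, as noted above.
sample = json.dumps({
    "eventId": "ffde140c-3053-4b9d-89c6-14b68025384d",
    "eventType": "ROUTE",
    "entitySize": 16,
    "childIds": [],
    "updatedAttributes": {"RouteOnAttribute.Route": "uv"},
})

cells = to_hbase_cells(sample)
for row, column, value in cells:
    print(row, column, value)
```

Keeping nested attributes as JSON text means a single scan shows the whole event, at the cost of having to parse those values again when you query on an individual attribute.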
Learning to Use HBase