Created on 01-28-2018 03:57 PM - edited 08-17-2019 09:20 AM
Using SiteToSiteProvenanceReportingTask to Send Provenance to Apache NiFi for Processing.
Eating our own provenance food!
It's almost comically easy to do this. You set up a task on the server you are reporting on that sends the data to your receiver. That other server you make a simple flow to ingest and process that. I stored it to HBase as JSON as it's a good place to put a lot of data fast.
Send The Data
You need to create a SiteToSiteProvenanceReportingTask in Controller Settings - Reporting Tasks. It's pretty simple. Set the values above with your destination NiFi server and a port name that you have created already.
Receive the Data and Process
An Individual JSON Record
Split the JSON into Records
$.[*]
Save to HBase (PutHBaseJSON)
First I have to create a table.
hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2.2.6.2.0-205, r5210d2ed88d7e241646beab51e9ac147a973bdcc, Sat Aug 26 09:33:50 UTC 2017
hbase(main):001:0> create 'PROVENANCE', 'event'
0 row(s) in 2.9900 seconds
=> Hbase::Table - PROVENANCE
scan 'PROVENANCE'
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousAttributes, timestamp=1517159115042, value={"path":"./","filename":"humidity.583225-583284.log","s2s.address":"192.168.1.197:55032","s2s.host":"1
92.168.1.197","mime.type":"text/plain","uuid":"9006a1bb-d755-4272-b8d3-76e666c2a7c6","tailfile.original.path":"/opt/demo/logs/humidity.log"}
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousContentURI, timestamp=1517159115042, value=http://192.168.1.193:8080/nifi-api/provenance-events/61825/content/input
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousEntitySize, timestamp=1517159115042, value=59
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupId, timestamp=1517159115042, value=01611005-4e82-1491-ae5d-ca64f59491cb
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupName, timestamp=1517159115042, value=Process MiniFi Creator
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestamp, timestamp=1517159115042, value=2018-01-28T00:25:30.616Z
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestampMillis, timestamp=1517159115042, value=1517099130616
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:updatedAttributes, timestamp=1517159115042, value={"RouteOnAttribute.Route":"humidity"}
ffde140c-3053-4b9d-89c6-14b68025384d column=event:actorHostname, timestamp=1517159114898, value=192.168.1.193
ffde140c-3053-4b9d-89c6-14b68025384d column=event:application, timestamp=1517159114898, value=NiFi Flow
ffde140c-3053-4b9d-89c6-14b68025384d column=event:childIds, timestamp=1517159114898, value=[]
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentId, timestamp=1517159114898, value=3a25cda9-0161-1000-813c-631724a10585
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentName, timestamp=1517159114898, value=RouteOnAttribute
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentType, timestamp=1517159114898, value=RouteOnAttribute
ffde140c-3053-4b9d-89c6-14b68025384d column=event:contentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/output
ffde140c-3053-4b9d-89c6-14b68025384d column=event:durationMillis, timestamp=1517159114898, value=-1
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityId, timestamp=1517159114898, value=9b017666-7ce9-45c5-9d0a-2f81e56d6fa8
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entitySize, timestamp=1517159114898, value=16
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityType, timestamp=1517159114898, value=org.apache.nifi.flowfile.FlowFile
ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventOrdinal, timestamp=1517159114898, value=61701
ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventType, timestamp=1517159114898, value=ROUTE
ffde140c-3053-4b9d-89c6-14b68025384d column=event:lineageStart, timestamp=1517159114898, value=1517084974341
ffde140c-3053-4b9d-89c6-14b68025384d column=event:parentIds, timestamp=1517159114898, value=[]
ffde140c-3053-4b9d-89c6-14b68025384d column=event:platform, timestamp=1517159114898, value=nifi
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousAttributes, timestamp=1517159114898, value={"path":"./","filename":"uv.164064-164080.log","s2s.address":"192.168.1.197:55032","s2s.host":"192.168
.1.197","mime.type":"text/plain","uuid":"9b017666-7ce9-45c5-9d0a-2f81e56d6fa8","tailfile.original.path":"/opt/demo/logs/uv.log"}
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousContentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/input
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousEntitySize, timestamp=1517159114898, value=16
ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupId, timestamp=1517159114898, value=01611005-4e82-1491-ae5d-ca64f59491cb
ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupName, timestamp=1517159114898, value=Process MiniFi Creator
ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestamp, timestamp=1517159114898, value=2018-01-28T00:25:30.607Z
ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestampMillis, timestamp=1517159114898, value=1517099130607
ffde140c-3053-4b9d-89c6-14b68025384d column=event:updatedAttributes, timestamp=1517159114898, value={"RouteOnAttribute.Route":"uv"}
1830 row(s) in 11.7680 seconds
Learning to Use HBase