Created 11-01-2017 11:49 PM
Hi guys,
I've been having issues viewing data provenance through the NiFi UI (v1.3, single instance) lately and was wondering if anyone could offer some troubleshooting advice. Here is what I've configured and observed:
- No workload running on the NiFi instance other than the flow I'm testing provenance activity against.
- The NiFi instance is secured and the user has the "query provenance" access policy (access is managed internally by NiFi, not through Ranger). The user has full policy access to all NiFi functionality.
- The Lucene provenance indexes are created as expected with timestamps on the file system aligning with NiFi activity.
- There is no filter applied to the UI provenance search to hide provenance events between a certain date range.
- My nifi.properties file has the below settings (only change other than size, duration, index/query threads is the use of the WriteAheadProvenanceRepository for performance).
# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
nifi.provenance.repository.debug.frequency=1_000_000
nifi.provenance.repository.encryption.key.provider.implementation=
nifi.provenance.repository.encryption.key.provider.location=
nifi.provenance.repository.encryption.key.id=
nifi.provenance.repository.encryption.key=

# Persistent Provenance Repository Properties
nifi.provenance.repository.directory.default=/data/nifi_content/Sandpit/repo/provenance_repository
nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=10 GB
nifi.provenance.repository.rollover.time=30 secs
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.index.threads=4
nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.always.sync=false
nifi.provenance.repository.journal.count=16
- The used repository size is well below the configured 10 GB, and I'm viewing provenance less than 24 hours after flow execution (512 MB of provenance vs 10 GB configured, provenance checked at 5, 10, 15, 20, ... minute intervals after the flow executed). Attachment "provsize.png".
- If I list the queue of the connection after the processor/s I'm viewing provenance activity against, all flow files are listed and visible as expected.
- The UI displays "no value set" as the begin time of the provenance UI search window. Attachment "provsearch.png"
I'm a little stumped by my missing provenance activity, so any community advice would be greatly appreciated.
Thanks
DH.
Created 11-07-2017 07:25 PM
Try stopping NiFi, purging everything within your provenance repository, and then starting NiFi again.
Check the nifi-app.log file for any provenance-related events.
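If the log is large, a small filter can narrow it down. The Python sketch below is only an illustration: it assumes the lines of interest mention "provenance" somewhere (logger name, thread name, or message) and carry a WARN or ERROR level surrounded by spaces, which may not match your logback layout. The function name `provenance_log_lines` is made up for this example.

```python
import re
from typing import Iterable, List, Sequence

# Assumption: interesting lines mention "provenance" (any case) somewhere in the line.
PROVENANCE = re.compile(r"provenance", re.IGNORECASE)


def provenance_log_lines(lines: Iterable[str],
                         levels: Sequence[str] = ("WARN", "ERROR")) -> List[str]:
    """Return lines that mention provenance and contain one of the given log levels.

    The level is matched as a space-delimited token (e.g. " ERROR "), which is an
    assumption about the log layout, not something guaranteed by NiFi.
    """
    matches = []
    for line in lines:
        if PROVENANCE.search(line) and any(f" {lv} " in line for lv in levels):
            matches.append(line.rstrip("\n"))
    return matches
```

To run it against a real log, pass an open file handle, for example `with open(log_path) as fh: print("\n".join(provenance_log_lines(fh)))`, where `log_path` points at wherever your install writes nifi-app.log.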
Check that the user running the NiFi process has read/write access to the configured provenance repository directory.
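One rough way to test the permissions is the Python sketch below. It only tells you something useful if you run it as the same OS user that runs NiFi, and the helper name `check_repo_access` is made up for this example. It checks the top-level directory only, not every file underneath it.

```python
import os
from typing import Dict


def check_repo_access(path: str) -> Dict[str, bool]:
    """Report whether the current OS user can use the given directory.

    os.access() checks permissions for the user running this script, so run it
    as the NiFi service user for a meaningful answer.
    """
    return {
        "exists": os.path.isdir(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "traversable": os.access(path, os.X_OK),  # execute bit is needed to enter a directory
    }
```

For the setup in the question you would call it with the value of nifi.provenance.repository.directory.default, i.e. `check_repo_access("/data/nifi_content/Sandpit/repo/provenance_repository")`. Any False value is worth following up on.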
I had a similar issue today; in my case the provenance implementation was set to Volatile, which I changed to WriteAhead.
Also note that the default implementation is PersistentProvenanceRepository, and if you have been switching implementations back and forth you will need to delete the provenance data. (WriteAhead can read data written by PersistentProvenanceRepository, but not the other way around.)
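To confirm which implementation is actually in effect, you can read it straight out of nifi.properties. The Python sketch below is a minimal illustration that handles plain key=value lines only (no line continuations or escapes); the helper names `parse_properties` and `provenance_implementation` are made up for this example.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def provenance_implementation(props: Dict[str, str]) -> str:
    """Return the short class name of the configured provenance repository, or ''."""
    full_name = props.get("nifi.provenance.repository.implementation", "")
    return full_name.rsplit(".", 1)[-1]
```

Feed it the contents of your nifi.properties, for example `provenance_implementation(parse_properties(open(props_path).read()))`, where `props_path` is the location of the file in your install. With the settings posted in the question this should come back as WriteAheadProvenanceRepository.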