I keep getting "nifi dataflow is exceeding provenance recording rate slowing down flow to accommodate". I currently have nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository set. Would setting it to org.apache.nifi.provenance.VolatileProvenanceRepository have any benefits, and what are the cons of setting it to volatile? Loss of data?
Change the following property to the value shown: nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository
This has been available since NiFi 1.2.
Note: If you implement this, comment out the following line in the bootstrap.conf file: java.arg.13=-XX:+UseG1GC
There are some known issues that could cause corruption in the JVM heap.
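As a rough sketch, the two changes described above would look something like this. Only the property line and the java.arg.13 line come from this thread; the comments are mine, and the argument number (13) may differ in your bootstrap.conf depending on the NiFi version.

```properties
# nifi.properties: switch the provenance repository to the write-ahead implementation
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository

# bootstrap.conf: the G1GC line commented out, per the note above
#java.arg.13=-XX:+UseG1GC
```

Both files are read at startup, so NiFi needs a restart before either change takes effect.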
Just to add to the recommended option from @Wynner above: the PersistentProvenance implementation is old and known to be slow when dealing with high volumes of events. The WriteAheadProvenance implementation is the much faster replacement. Are you positive your current configuration is using WriteAhead? Was NiFi restarted after making the change?
If so, consider increasing these provenance configuration properties:
You may also want to check whether disk I/O is an issue. Have you configured the NiFi provenance, content, and flowfile repositories to each have their own dedicated disks?
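For illustration, putting each repository on its own disk in nifi.properties might look like the sketch below. The /data/disk1 to /data/disk3 mount points are hypothetical placeholders, not paths from this thread; substitute whatever dedicated volumes you actually have.

```properties
# Hypothetical mount points: one physical disk per repository
nifi.flowfile.repository.directory=/data/disk1/flowfile_repository
nifi.content.repository.directory.default=/data/disk2/content_repository
nifi.provenance.repository.directory.default=/data/disk3/provenance_repository
```

The idea is simply that provenance writes no longer compete with content and flowfile writes for the same disk.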
The VolatileProvenance implementation is held in the NiFi JVM's heap memory space. Setting the number of retained events too high can lead to OOM issues in NiFi. In addition, since it is held in heap, all events are lost any time the NiFi JVM restarts.
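If you do want to try the volatile implementation anyway, a minimal sketch of the relevant nifi.properties lines is below. The buffer size property controls how many events are retained in heap; the value shown is, as far as I know, the shipped default, so treat it as a starting assumption rather than a recommendation.

```properties
# Provenance events kept in JVM heap only; nothing survives a restart
nifi.provenance.repository.implementation=org.apache.nifi.provenance.VolatileProvenanceRepository

# Number of events retained in heap (assumed default); raising it increases heap usage
nifi.provenance.repository.buffer.size=100000
```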