I keep getting "nifi dataflow is exceeding provenance recording rate slowing down flow to accommodate". I currently have nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.WriteAheadFlowFileRepository set. Would setting it to org.apache.nifi.provenance.VolatileProvenanceRepository have any benefits, and what are the cons of setting it to volatile? Would I lose data?
Just to add to the recommended option from @Wynner above: the PersistentProvenance implementation is old and known to be slow when dealing with high volumes of events. The WriteAheadProvenance implementation is the much faster replacement. Are you positive your current configuration is using WriteAhead? Was NiFi restarted after making the change?
If so, consider increasing this provenance configuration property: nifi.provenance.repository.concurrent.merge.threads
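For reference, here is a rough sketch of how those entries might look in nifi.properties. Note that the provenance implementation is selected by nifi.provenance.repository.implementation, which is a separate property from nifi.flowfile.repository.implementation. The thread count shown is only an illustrative assumption, not a tuned recommendation; the right value depends on your CPU and disk capacity.

```properties
# Selects the provenance repository implementation
# (independent of nifi.flowfile.repository.implementation)
nifi.provenance.repository.implementation=org.apache.nifi.provenance.WriteAheadProvenanceRepository

# Illustrative value only; raise gradually and watch CPU and disk load
nifi.provenance.repository.concurrent.merge.threads=4
```

NiFi only reads nifi.properties at startup, so a restart is needed before either change takes effect.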
You may also want to check whether disk I/O is the bottleneck. Have you configured the NiFi provenance, content, and flowfile repositories to each have their own dedicated disk?
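A minimal sketch of what separating the repositories could look like in nifi.properties. The /disk1, /disk2, and /disk3 mount points are hypothetical placeholders; substitute whatever dedicated volumes exist on your host.

```properties
# Hypothetical mount points, one physical disk per repository
nifi.flowfile.repository.directory=/disk1/flowfile_repository
nifi.content.repository.directory.default=/disk2/content_repository
nifi.provenance.repository.directory.default=/disk3/provenance_repository
```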
The VolatileProvenance implementation holds its events in the NiFi JVM's heap memory. Setting the number of retained events too high can lead to OOM issues in NiFi. In addition, because the events live in heap, all of them are lost any time the NiFi JVM restarts.
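If you do decide to try the volatile implementation, the configuration would look roughly like the sketch below. The buffer size is the number of events kept in heap; the value shown is an illustrative assumption, so size it against your available heap.

```properties
# Events are kept in JVM heap only and are lost on restart
nifi.provenance.repository.implementation=org.apache.nifi.provenance.VolatileProvenanceRepository

# Number of provenance events retained in memory (illustrative value)
nifi.provenance.repository.buffer.size=100000
```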