@naveen-
I would recommend getting several thread dumps from NiFi while it is in this state to see what is causing your threads to stall. You can do this with the <path to NiFi>/bin/nifi.sh script as follows:
-
./nifi.sh dump <name of dump file>
-
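Since a single dump only shows one moment in time, it can help to take a few of them some seconds apart and compare which threads stay stuck. A minimal sketch of that, assuming a POSIX shell: the function name take_thread_dumps, the output directory, the file naming, and the default count and interval are all illustrative choices, not part of NiFi. Only the "nifi.sh dump <file>" call comes from the command above.

```shell
# take_thread_dumps <path to nifi.sh> <output dir> [count] [interval seconds]
# Hypothetical helper: calls "nifi.sh dump <file>" several times in a row.
take_thread_dumps() {
  nifi_sh=$1
  out_dir=$2
  count=${3:-3}       # assumed default: 3 dumps
  interval=${4:-30}   # assumed default: 30 seconds between dumps

  mkdir -p "$out_dir" || return 1
  # Resolve to an absolute path so the dump location does not depend on
  # whatever working directory the script uses internally.
  out_dir=$(cd "$out_dir" && pwd)

  i=1
  while [ "$i" -le "$count" ]; do
    "$nifi_sh" dump "$out_dir/thread-dump-$i-$(date +%Y%m%d-%H%M%S).txt" || return 1
    # No need to wait after the last dump.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done
}

# Example call (paths are placeholders):
#   take_thread_dumps "<path to NiFi>/bin/nifi.sh" /tmp/nifi-dumps 3 30
```

Threads that show the same stack in every dump are the ones worth looking at first.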
Some other things to try:
1. Under heavy volume, the default NiFi provenance implementation (org.apache.nifi.provenance.PersistentProvenanceRepository) may not be able to keep up. If NiFi is waiting on provenance, all flows will appear to be stalled. Make sure you are instead using the newer org.apache.nifi.provenance.WriteAheadProvenanceRepository implementation, which was redesigned to be much more performant. The implementation is selected with the nifi.provenance.repository.implementation property in nifi.properties.
2. Make sure you do not have constant garbage collection occurring. Even a minor/young GC is a stop-the-world event. It is possible that after NiFi has been running and ingesting data for some time, GC gets into a nonstop cycle of trying to free heap space.
3. Have you changed the default Max Timer Driven Thread Count setting under "Controller Settings" in the Global menu in the upper right corner of the UI? The default is only 10.
4. Avoid configuring any of your processors to use the Event Driven scheduling strategy.
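Items 1 and 2 can be checked quickly from a shell. The following is only a sketch under some assumptions: the function names provenance_impl and gc_snapshot are made up for illustration, the location of nifi.properties depends on your install, and gc_snapshot assumes the JDK's jstat tool is on the PATH and is run as the same user that owns the NiFi JVM.

```shell
# provenance_impl <path to nifi.properties>
# Prints the configured provenance repository implementation class.
provenance_impl() {
  sed -n 's/^nifi\.provenance\.repository\.implementation=//p' "$1"
}

# gc_snapshot <pid of the NiFi JVM>
# Prints 10 samples of GC utilization, one every 1000 ms.
# Finding the pid is left to you; it is passed in as an argument.
gc_snapshot() {
  jstat -gcutil "$1" 1000 10
}

# Example calls (path and pid are placeholders):
#   provenance_impl "<path to NiFi>/conf/nifi.properties"
#   gc_snapshot <pid>
```

If the first call prints the PersistentProvenanceRepository class, item 1 applies. For the second, collection counts and times that climb on every sample while heap occupancy stays high would suggest the nonstop GC cycle described in item 2.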
-
Thank you,
Matt