NiFi CPU utilization without active threads (workflow files)

New Contributor

Hello,

 

My NiFi flow has two process groups that process data from Kafka:

main - ~20 processors that process 100 JSON messages per second

support - ~900 processors that process 1-2 JSON messages per day

 

When only the main group is running (started) and no messages are being processed, the CPU load is ~5%. When the support group is also started, the CPU load rises to ~20% (still without messages). In addition, the main group handles messages with 1-2 sec latency while the support group is stopped, but latency increases dramatically (10-20x) as soon as I start the support group.

 

Is it possible to give the main group a higher priority, or to configure the NiFi scheduler in some other way?

 

Thanks.

1 ACCEPTED SOLUTION

Master Mentor

@AndreyN 
Each processor by default uses the Timer Driven scheduling strategy and a Run Schedule of 0 sec. This means that each processor constantly requests threads from the Max Timer Driven Thread pool to check for work (work being any FlowFiles on inbound connections or, for an ingest processor, connecting to the ingest point, whether a local directory or a remote service, to check for data). These checks generally take microseconds or longer, depending on the processor. NiFi does have a global setting that makes a processor yield when its previous run processed or produced no FlowFiles.
To prevent excessive latency, this yield duration is very short by default (10 ms).

To adjust this setting, change the following property in the nifi.properties file:
nifi.bored.yield.duration, described in the Core Properties section of the admin guide.
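
As a rough illustration, the relevant line in nifi.properties might look like the fragment below. The 50 millis value is only an assumed example, not a recommendation; the default is 10 millis, and changes to nifi.properties take effect after NiFi is restarted.

```properties
# Core Properties section of conf/nifi.properties
# Default: nifi.bored.yield.duration=10 millis
# Example value only (an assumption); tune it against the latency you can tolerate.
nifi.bored.yield.duration=50 millis
```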


 

Keep in mind that this setting impacts all processors. The higher you set it, the more latency there can be before new work is found after a run that resulted in no work.

You can also adjust the run schedule on selected processors. A run schedule of 0 sec means run as fast as possible: as soon as one thread completes, the processor requests another (unless the run resulted in no work, in which case it waits the bored yield duration before requesting the next thread). So if you have flows that are always very light and latency is not a concern, you could set those processors to be scheduled only every 1 sec or longer.
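
To get a feel for the numbers, here is a back-of-the-envelope sketch in Python. It is a simplified model, not how NiFi measures anything: it assumes every idle processor is triggered once per wait interval, that each check takes negligible time, and it ignores the thread pool size. The function name and the processor counts (taken from the question) are only for illustration.

```python
def idle_checks_per_second(processors: int, wait_ms: float) -> float:
    """Upper bound on no-work scheduling attempts per second.

    Assumes each idle processor is triggered once every `wait_ms`
    milliseconds (the bored yield duration or the run schedule).
    """
    return processors * (1000 / wait_ms)


# ~900 idle processors in the support group, default 10 ms bored yield
support_default = idle_checks_per_second(900, 10)

# Same group with the run schedule raised to 1 sec on every processor
support_one_sec = idle_checks_per_second(900, 1000)

# ~20 processors in the main group when idle, default 10 ms bored yield
main_default = idle_checks_per_second(20, 10)

print(f"support, 10 ms yield:   {support_default:,.0f} checks/s")
print(f"support, 1 sec schedule: {support_one_sec:,.0f} checks/s")
print(f"main, 10 ms yield:      {main_default:,.0f} checks/s")
```

Under these assumptions the idle support group can generate up to about 90,000 checks per second, versus about 900 with a 1 sec run schedule, which is why slowing the schedule on the rarely used processors is usually less disruptive than raising the global yield.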

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,

Matt



