
NiFi performance issues

Explorer

Hi Team,

 

We have been facing NiFi performance issues for a while, ever since we increased the number of processors and process groups. We have increased the thread count but are still facing the issue. CPU utilization is not even reaching a moderate level; it stays below 50%. How can we improve performance in NiFi? We had been using version 1.13.2 and migrated to 1.18.0 to see if there would be a performance improvement, but we are still seeing the same behaviour. Could anyone suggest how to improve performance?


Super Mentor

@sarithe 

The NiFi components (processors, input ports, output ports, funnels, etc.) request a thread from the configured "Max Timer Driven Thread Count" pool each time the component is scheduled to execute. Adding more components will NOT increase the size of the thread pool (which defaults to only 10 out of the box). Generally speaking, the recommended starting size for your "Max Timer Driven Thread Count" is 2 to 4 times the number of cores on your NiFi server (example: 16 cores would have the thread pool set between 32 and 64). After adjusting your thread pool, you should monitor CPU utilization on your server while your dataflows are being executed within NiFi. You may find that your NiFi components complete their threads very quickly (milliseconds) and the CPU load average remains low, meaning you could set the thread pool even larger. Or you may find your core load average remains very high, meaning you should reduce your thread pool size.

The configuration for "Max Timer Driven Thread Count" can be found in the NiFi UI --> Global menu (upper right corner) --> Controller Settings.
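If you want to sanity-check the math for your own nodes first, here is a minimal Groovy sketch (nothing NiFi-specific is assumed; it just reports on whatever host runs it) that prints the suggested starting range:

```groovy
// Rough sizing helper: suggests a starting "Max Timer Driven Thread Count"
// range of 2x to 4x the core count of the node it runs on.
def cores = Runtime.getRuntime().availableProcessors()
println "Detected ${cores} cores on this node"
println "Suggested starting Max Timer Driven Thread Count: ${cores * 2} to ${cores * 4}"
```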

Note: There is also an "Event Driven Thread Count" pool, which you should NOT edit. It is used by processors that have been configured with the "Event Driven" scheduling strategy (no processors use this by default). That strategy was experimental and has been deprecated; it will be removed completely in the next major Apache NiFi release.


Thank you,

Matt

Explorer

Hi Matt,

 

Thanks for sharing the insights. We are using a 3-node cluster, and each node has 16 cores; we have set the Max Timer Driven Thread Count to 160. We are seeing core utilization of about 8%. When we get large data, such as 2 GB in one flow, ingestion becomes extremely slow despite the increase in threads.

Super Mentor

@sarithe 
It is difficult to provide suggestions without details about your dataflows. Some components (processors, etc.) are going to be CPU intensive and others memory intensive (the embedded docs for each component list its resource considerations). You also need to look at disk I/O for the locations where the NiFi repositories are stored. The more FlowFiles you have (small files, but a lot of them), the higher the disk I/O. You can also improve things here by using the NiFi record-based processors where possible, so that one FlowFile contains many pieces of content in record form.

You may also look at the concurrent task settings on your processors. All processors default to 1 concurrent task, but increasing this value is like cloning the processor, allowing multiple concurrent executions (each concurrent thread executes against a different FlowFile from the inbound connection(s)). Also, you have set your Max Timer Driven Thread Count to a value that is too high.
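As a starting point on the disk I/O question, a small Groovy sketch like this prints where each repository actually lives, so you can watch those volumes with your OS I/O tools (e.g. iostat). The nifi.properties path below is only an example; point it at your install:

```groovy
// Sketch: print the configured NiFi repository locations so you can check
// disk I/O on the correct volumes.
// The path to nifi.properties is an assumption; adjust for your installation.
def props = new Properties()
new File('/opt/nifi/conf/nifi.properties').withInputStream { props.load(it) }

['nifi.flowfile.repository.directory',
 'nifi.content.repository.directory.default',
 'nifi.provenance.repository.directory.default'].each { key ->
    println "${key} = ${props.getProperty(key)}"
}
```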

Do you see any out of memory (OOM) errors or "Too many open files" errors in nifi-app.log? These too will impact performance. NiFi needs a lot of file handles to work with the many components and FlowFiles being processed.
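If you want a quick look at file-handle headroom from inside a JVM, here is a rough Groovy sketch. It reports on whichever JVM runs it, so run the same idea inside NiFi (for example via ExecuteScript) to inspect the NiFi process itself:

```groovy
// Sketch: report open vs. maximum file descriptors for the current JVM.
// Works on Linux/Unix JVMs that expose UnixOperatingSystemMXBean.
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

def os = ManagementFactory.operatingSystemMXBean
if (os instanceof UnixOperatingSystemMXBean) {
    println "Open file descriptors: ${os.openFileDescriptorCount}"
    println "Max file descriptors:  ${os.maxFileDescriptorCount}"
}
```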


Thank you,

Matt



Explorer

Hi Matt,

 

Thanks for the inputs. Below are the garbage collection statistics for our NiFi environment. Could you please check and let us know whether any tuning can be done in this area to improve performance?

 

[Attachment: sarithe_0-1689073686505.png — screenshot of JVM garbage collection statistics]

 

Super Mentor

@sarithe 

That looks pretty healthy to me. Young Gen GC is normal activity to see, as the JVM will not free up memory until heap utilization reaches around the 80% point. Old Gen is full garbage collection, which takes a lot longer and is triggered when Young Gen does not clean up enough heap space. I see that yours is at 0.
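For reference, roughly the same numbers your screenshot summarises can be pulled from the JVM's GC MXBeans; a minimal Groovy sketch (run inside the NiFi JVM, e.g. via ExecuteScript, or use jstat/jconsole against the NiFi process instead):

```groovy
// Sketch: print per-collector counts and accumulated time from the JVM GC MXBeans.
import java.lang.management.ManagementFactory

ManagementFactory.garbageCollectorMXBeans.each { gc ->
    println "${gc.name}: ${gc.collectionCount} collections, ${gc.collectionTime} ms total"
}
```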

Have you adjusted the concurrent tasks on the processors where you see a bottleneck or lower than expected throughput, consumption, etc.?


Thank you,

Matt

Explorer

Yes, whenever we see slowness we increase the concurrent tasks to 6 or 7, but I am still not seeing much difference; sometimes it does get better when the concurrent tasks are increased. This behaviour is mainly seen with the ReplaceText processor.

Super Mentor

@sarithe Mainly seen with ReplaceText? This processor reads from and writes to the content repository. How is disk I/O for your content_repository? How many concurrent tasks are configured on this processor?

How large is the content that ReplaceText is evaluating, and how is the processor configured?
This processor will also read all or part of that content (depending on its configuration) into heap memory in order to perform the text replacement.
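If heap is the concern for very large content, one option (a rough sketch, not a drop-in replacement for your flow) is to do the replacement in a streaming, line-by-line fashion with an ExecuteScript Groovy script; the pattern and replacement below are placeholders only:

```groovy
// Sketch: streaming line-by-line text replacement in ExecuteScript (Groovy),
// so the full content never has to be held in heap at once.
// The pattern /foo/ and replacement 'bar' are placeholders only.
import org.apache.nifi.processor.io.StreamCallback
import java.nio.charset.StandardCharsets

def flowFile = session.get()
if (!flowFile) return

flowFile = session.write(flowFile, { inStream, outStream ->
    inStream.withReader(StandardCharsets.UTF_8.name()) { reader ->
        outStream.withWriter(StandardCharsets.UTF_8.name()) { writer ->
            reader.eachLine { line ->
                writer.write(line.replaceAll(/foo/, 'bar'))
                writer.write('\n')
            }
        }
    }
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)
```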

Matt

New Contributor

Right.

Super Collaborator

NiFi is awesome with so many out-of-the-box processors... however, I have found that sometimes a very specialized scripted Groovy processor that fetches 1,000 or more FlowFiles at a time performs significantly faster, especially if your custom processor consolidates several processors into one.
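The batching idea is roughly this (a sketch for ExecuteScript with the Groovy engine; the attribute it sets is just an illustration, and the real work would go where the comment indicates):

```groovy
// Sketch: pull up to 1000 FlowFiles per invocation instead of one at a time,
// which cuts per-invocation scheduling overhead for high-volume flows.
def flowFiles = session.get(1000)
if (!flowFiles) return

flowFiles.each { flowFile ->
    // ... real per-file (or consolidated) logic would go here ...
    flowFile = session.putAttribute(flowFile, 'batch.processed', 'true')  // illustrative attribute only
    session.transfer(flowFile, REL_SUCCESS)
}
```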