NiFi high jvm heap utilization on primary node

Dave0x1 (Rising Star):

Hi there,

Maybe you can shed some light on an issue we’ve been having since we went live with one of our flows, which uses a lot of processors running on the primary node.

We are using a 2-node cluster where we run a couple of GetSQS (timer driven, 1 min) and ExecuteSQL (cron scheduling, spread out during the day) processors, both on the primary node. We do this so only one task runs, resulting in just one trigger FlowFile; otherwise we would get 2 SQS events for the same event and 2 ExecuteSQL queries.

What we have been seeing is that JVM heap utilization on the primary node is above 70%. It increased gradually from 25% over the past few days, so it didn’t spike straight to 70%. The secondary node, on the other hand, is stable below 10%, so there is a clear correlation between the processors running only on the primary node and the heap used.

Our questions are:
1. When a primary-node processor triggers, will the FlowFile be processed by the following processors in the flow only on the primary node, or can this also be done by the secondary node? The other processors run on all nodes.

2. Could ExecuteSQL on cron cause high heap utilization? The cron schedules are spread out during the day and volumes are low, around 1,300 queries a day.

3. Could GetSQS cause high heap utilization? We receive a lot of events; later in the flow we filter out the ones we need and terminate the rest. That’s about 8k events spread out during the day, of which we only process about 3k. We are working on fine-tuning the SQS events so we only receive the ones that really need to be processed (see the sketch below).
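For illustration only: if the events fan out to the queue from an SNS topic, a subscription filter policy along these lines (the attribute name and values are made up) would drop the unneeded events before they ever reach SQS:

{
  "eventType": ["order-created", "order-updated"]
}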

Hopefully you can offer some advice on the challenge we are having.

By the way, when restarting the primary node, the secondary becomes primary and we see the same heap issue there.

Thank you in advance.

Kind regards,

Dave


Dave0x1 (Rising Star):

Here’s some extra info:

JVM heap utilization is high on the primary node:

[screenshot: Dave0x1_0-1718363140189.png]

Core load is normal:

[screenshot: Dave0x1_1-1718363183897.png]

Dave0x1 (Rising Star):

Correction: the primary node switched by itself, and the high heap utilization stayed on the original node. So the problem doesn’t seem to follow the current primary node.

SAMSAL:

Hi @Dave0x1 ,

Not sure if this is related, but if you are using release 2.0.0-M1/M2 and deploying Python extensions, please see this: https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Release-2-0-M1-amp-M2-High-CPU-Utili...

 

Dave0x1 (Rising Star):

Hi there @SAMSAL, we are using version 1.24 and no Python scripting. We’re using all standard processors (UpdateAttribute, RouteOnAttribute, LogAttribute) and some specials like GetSQS, ExecuteSQL, and LookupAttribute, plus some Groovy script for creating custom metrics.

We’ve just restarted the node with the high heap; it’s now averaging around 10%. We’ll be monitoring it this weekend and will get back with the results on Monday. Hopefully the restart helped.

Thank you for now and have a nice weekend 🙂 

MattWho (Super Mentor):

[Accepted solution — the full reply appears at the end of this thread.]

Dave0x1 (Rising Star):

Hi Matt, thank you for the extensive reply. This is a lot to think about. We’ll go through it Monday morning with the team to see if we can get those metrics on our dashboards 🤓 There was a MergeContent temporarily on the canvas to gather analysis data, which might have caused this issue. That template was already removed yesterday.
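For the dashboards, something like this minimal sketch is what we have in mind: it polls NiFi’s system-diagnostics REST endpoint for heap figures (assuming an unsecured API on localhost:8080; host, port, and auth would need adjusting for a real cluster):

import json
import time
import urllib.request

# NiFi exposes JVM heap stats via GET /nifi-api/system-diagnostics on each node;
# the URL below is a placeholder for our setup.
NIFI_API = "http://localhost:8080/nifi-api/system-diagnostics"

while True:
    with urllib.request.urlopen(NIFI_API) as resp:
        diag = json.load(resp)
    # aggregateSnapshot holds the node-wide JVM figures
    snap = diag["systemDiagnostics"]["aggregateSnapshot"]
    print(snap["heapUtilization"], snap["usedHeap"], "of", snap["maxHeap"])
    time.sleep(60)  # sample once a minute

The heapUtilization field should line up with the heap percentage the NiFi UI shows per node.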

@SAMSAL @MattWho thank you for your replies. I’ll get back to you on Monday with our findings.

Dave0x1 (Rising Star):

Update from our side: from the looks of things, stopping and removing the MergeContent used to create CSV files has solved the JVM heap issue. We will watch the MEMORY resource consideration on processors when implementing new things.

Thank you all for the great advice and fast replies!

MattWho (Super Mentor):

@Dave0x1 
Typically the MergeContent processor will utilize a lot of heap when the number of FlowFiles being merged in a single execution is very high and/or the FlowFiles’ attributes are very large. While FlowFiles queued in a connection have their attributes/metadata held in NiFi heap, there is a swap threshold at which NiFi swaps FlowFile attributes to disk. When it comes to MergeContent, FlowFiles are allocated to bins (they will still show in the inbound connection count), and FlowFiles allocated to bins cannot be swapped. So if you set the min/max number of FlowFiles or min/max size to a large value, it will result in large amounts of heap usage.
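For reference, that swap threshold is a per-connection setting in nifi.properties; the default is shown below (tuning it is a separate exercise):

# nifi.properties
nifi.queue.swap.threshold=20000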

Note: FlowFile content is not held in heap by MergeContent.

So the way to create very large merged files while keeping heap usage lower is to chain multiple MergeContent processors together in series: merge a batch of FlowFiles in the first MergeContent, then merge those into a larger merged FlowFile in a second MergeContent. A rough sketch follows.
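As a rough illustration of the two-stage approach (the numbers are made up; tune them to your volumes):

MergeContent #1
  Merge Strategy: Bin-Packing Algorithm
  Minimum Number of Entries: 1000
  Maximum Number of Entries: 1000
  Max Bin Age: 5 min

MergeContent #2
  Merge Strategy: Bin-Packing Algorithm
  Minimum Number of Entries: 100    (100 x 1000 = 100,000 source FlowFiles per final merge)
  Maximum Number of Entries: 100
  Max Bin Age: 15 min

This way each stage-one bin holds at most 1,000 FlowFiles’ worth of attributes in heap, and stage two only ever bins 100 already-merged FlowFiles.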

Also, to help minimize heap usage, be mindful about extracting content into FlowFile attributes or generating FlowFile attributes with large values; an example of the anti-pattern follows.
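For example, an ExtractText processor configured with a hypothetical dynamic property like the one below pulls content into an attribute; with the capture-length limits raised, the entire content can end up in heap everywhere that FlowFile goes:

ExtractText (anti-pattern sketch)
  content : (?s)(.*)    <- dynamic property; the matched content lands in the "content" attribute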

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.

Thank you,
Matt