Created 06-05-2024 08:27 AM
Hello,
We have a 3 node cluster where one of the nodes is only 64 cores 128 GB RAM and other two machines are each 128 cores 128 GB RAM. I am using Round Robin load balancing in the flow as Listfile --> LB --> FetchFile, as a result of which load is getting divided (almost) uniformly among all the machines, which is expected behavior. However, the load average on smaller machine is exceeding 64 (this machine is sweating!) whereas the load average on other two machines is almost 50.
So my question is there a way in NiFi to distribute load in such a way that the load on the smaller machine can be maintained to stay less than 64, and give the other two machines some more work to do?
I tried using DistributeLoad processor, but not sure which strategy to use because Round robin would distribute load equally, next available would distribute when the next node is available. However, how do I configure DistributeLoad such that it divides load equally among two bigger machines but lesser load to the smaller machine?
I would really appreciate your suggestions. Thanks!
Created 06-06-2024 08:02 AM
@G_B
NiFi cluster deployments expect that all nodes in the cluster have same hardware specifications. There is no option in NiFi's Load Balanced connections to customize load-balancing based on current CPU load average of some other node. Even doing so would require NiFi nodes to continuously ping all other nodes to get the current load average before sending FlowFiles which would impact performance. The only thing that would result in any form of variation in distribution would be a node receive rate being diminished, but that is out of NiFi's control. Round Robin will skip a node in rotation if the node is unable to receive FlowFiles as fast as another node.
Also keep in mind that a NiFi Cluster elects a node the roles "cluster coordinator" and "primary node". Sometimes both roles get assigned to same node. The assignment of these roles can change at. anytime. The primary node is only node that will schedule "primary node" only processors to execute. So your one node lighter on CPU could also end up assigned this role adding to its CPU load average.
Often CPU load average is not only impacted by volume, but also content size of the FlowFiles. The LB connections also do not take in to account FlowFile content size when distributing FlowFiles.
While your best option here performance wise is to make sure all nodes have same hardware specifications, there are a few less performant options you could try to distribute your data differently.
1. Use Remote Process Group (RPG) which uses Site-To-SIte (S2S) to distribute FlowFiles across your NiFi nodes. Always recommend using RPG to push to a Remote Input port rather then pull from an Remote output port to achieve better load distribution. Issue here is you need to add RPGs and Remote ports everywhere you were previously using LB configured connections.
2. Build a smart data distribution reusable dataflow. You could build a data flow that sorts FlowFiles by their content size ranges, merges bundles via mergeContent using FlowFile Stream, v3 merge format, send bundles based on size ranges to your various nodes via invokeHTTP to listenHTTP, and then unpackContent once received to extract the FlowFile bundle. This mergeContent is going to add addition cpu load.
3. Consider using DistributeLoad (can be configured with weighted distribution allowing you to create three distribution relationships with maybe like 5 FlowFile per relationship 1 and 2, and relationship with only 1 per iteration. This allows you to send 1 to you lower core node for every 5 sent to other two nodes. You would still need to use updateAttribute (set custom target node URL), mergeContent, invokeHttp, ListenHTTP, and unpackContent in this flow.
So if addressing your hardware differences is not option, Number 1 is probably your next best choice.
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt
Created 06-05-2024 11:50 AM
@G_B Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our NiFi experts @MattWho @SAMSAL who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres,Created 06-06-2024 08:02 AM
@G_B
NiFi cluster deployments expect that all nodes in the cluster have same hardware specifications. There is no option in NiFi's Load Balanced connections to customize load-balancing based on current CPU load average of some other node. Even doing so would require NiFi nodes to continuously ping all other nodes to get the current load average before sending FlowFiles which would impact performance. The only thing that would result in any form of variation in distribution would be a node receive rate being diminished, but that is out of NiFi's control. Round Robin will skip a node in rotation if the node is unable to receive FlowFiles as fast as another node.
Also keep in mind that a NiFi Cluster elects a node the roles "cluster coordinator" and "primary node". Sometimes both roles get assigned to same node. The assignment of these roles can change at. anytime. The primary node is only node that will schedule "primary node" only processors to execute. So your one node lighter on CPU could also end up assigned this role adding to its CPU load average.
Often CPU load average is not only impacted by volume, but also content size of the FlowFiles. The LB connections also do not take in to account FlowFile content size when distributing FlowFiles.
While your best option here performance wise is to make sure all nodes have same hardware specifications, there are a few less performant options you could try to distribute your data differently.
1. Use Remote Process Group (RPG) which uses Site-To-SIte (S2S) to distribute FlowFiles across your NiFi nodes. Always recommend using RPG to push to a Remote Input port rather then pull from an Remote output port to achieve better load distribution. Issue here is you need to add RPGs and Remote ports everywhere you were previously using LB configured connections.
2. Build a smart data distribution reusable dataflow. You could build a data flow that sorts FlowFiles by their content size ranges, merges bundles via mergeContent using FlowFile Stream, v3 merge format, send bundles based on size ranges to your various nodes via invokeHTTP to listenHTTP, and then unpackContent once received to extract the FlowFile bundle. This mergeContent is going to add addition cpu load.
3. Consider using DistributeLoad (can be configured with weighted distribution allowing you to create three distribution relationships with maybe like 5 FlowFile per relationship 1 and 2, and relationship with only 1 per iteration. This allows you to send 1 to you lower core node for every 5 sent to other two nodes. You would still need to use updateAttribute (set custom target node URL), mergeContent, invokeHttp, ListenHTTP, and unpackContent in this flow.
So if addressing your hardware differences is not option, Number 1 is probably your next best choice.
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you,
Matt