How to increase NiFi RPG throughput?


Hi,

I am a new NiFi user trying to integrate it into one of our use cases. I have set up a 3-node NiFi cluster (26 cores, 128 GB RAM). I am stress testing my flow and cannot achieve the throughput I require. I am following the best-practices articles, but even with the simple flow below I cannot get beyond 7 Mbps of transfer. Can anyone suggest how to increase the throughput?

Flow:

[GenerateFlowFile (Primary node only) -> UpdateAttribute -> processGroupPort] -> OutputPort (1 KB messages in chunks of 10) [RPG] -> [inputPortProcessGroup -> UpdateAttribute -> drop]

Flowfiles are stuck transferring between processGroupPort -> OutputPort. I have tried various combinations of back pressure, number of threads, and batch size in the RPG. The maximum I could achieve was 8 Mbps. I have seen various use cases where users have achieved much higher throughput than this.
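For a rough sense of scale, the figures above can be converted into flowfiles per second. This is only a back-of-the-envelope sketch: it assumes "Mbps" means 10^6 bits per second and that a 1 KB message is 1024 bytes, neither of which is stated in the post, and the function names are made up for illustration.

```python
# Rough conversion of a transfer rate into 1 KB flowfiles per second.
# Assumptions (not stated in the post): "Mbps" = 10**6 bits per second,
# and a 1 KB message = 1024 bytes.

def flowfiles_per_second(rate_mbps: float, flowfile_bytes: int = 1024) -> float:
    """Number of flowfiles of the given size that fit into the rate."""
    bytes_per_second = rate_mbps * 1_000_000 / 8
    return bytes_per_second / flowfile_bytes


def batches_per_second(rate_mbps: float, batch_size: int = 10,
                       flowfile_bytes: int = 1024) -> float:
    """Number of batches per second when flowfiles are sent in fixed-size chunks."""
    return flowfiles_per_second(rate_mbps, flowfile_bytes) / batch_size


if __name__ == "__main__":
    for rate in (7, 8):
        print(f"{rate} Mbps: about {flowfiles_per_second(rate):.0f} flowfiles/s, "
              f"{batches_per_second(rate):.1f} batches of 10 per second")
```

Under these assumptions the observed rates correspond to somewhat under a thousand small flowfiles per second in total, which suggests looking at per-flowfile and per-batch overhead rather than raw bandwidth.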

A few settings I have changed in nifi.conf:

nifi.queue.swap.threshold=120000

backpressure threshold = 10000

java heap = 20GB Maximum

Timer Driven Thread Count in controller settings : 500

Can you please help me configure the flow for optimum performance?

Thanks,
Ashwin

1 REPLY

@ashwin konale

I would suggest you flip your design for your Site-to-Site dataflow.

Instead of using a "pull" design:

"<dataflow>" --> "Remote Output Port" --> "Remote Process Group (RPG)" --> "<dataflow>"

Switch to a "push" design:

"<dataflow>" --> "Remote Process Group (RPG)" --> "Remote Input Port" --> "<dataflow>"

Every node in a NiFi cluster runs its own copy of the flow. That means an RPG on one node that is pulling data from a remote port on another system has no idea how many other nodes may be doing the same. Each node only knows, from the S2S details it has collected, that the target consists of some number of nodes, so there is no coordinated pulling strategy across the nodes.

With a push model (RPG --> "Remote Input Port"), the sending nodes know how many nodes are in the target cluster and can construct a better load distribution strategy.
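To illustrate why knowing the target node count helps, here is a toy sketch in which each sender spreads its flowfiles round-robin over the target nodes. It is not NiFi's actual Site-to-Site peer selection, which is more involved; `push_round_robin` and its parameters are hypothetical names used only for this example.

```python
# Toy model of a "push" design: every sender knows how many target nodes
# exist and spreads its flowfiles across them in round-robin order.
# This is an illustration only, not NiFi's real peer-selection logic.

def push_round_robin(num_senders: int, flowfiles_per_sender: int,
                     num_targets: int) -> dict[int, int]:
    """Return how many flowfiles each target node receives."""
    received = {target: 0 for target in range(num_targets)}
    for sender in range(num_senders):
        for i in range(flowfiles_per_sender):
            # Offset by the sender index so senders do not all start
            # on the same target node.
            target = (sender + i) % num_targets
            received[target] += 1
    return received


if __name__ == "__main__":
    # One sending node (e.g. data generated on the primary node only)
    # pushing to a 3-node target cluster.
    print(push_round_robin(num_senders=1, flowfiles_per_sender=1000, num_targets=3))
    # Three sending nodes pushing to a 3-node target cluster.
    print(push_round_robin(num_senders=3, flowfiles_per_sender=1000, num_targets=3))
```

Even with a single sending node, the flowfiles end up spread almost evenly across the three targets in this model, because the sender decides where each batch goes.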

Also take a look at the following article for additional RPG tuning:

https://community.hortonworks.com/articles/109629/how-to-achieve-better-load-balancing-using-nifis-s...

Thank you,

Matt

If you found that this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.