Capacity planning for NiFi cluster

Is there a guide available for NiFi capacity planning? Which load characteristics should be looked at?

- Number of concurrent processors running

- Total throughput

Any help is greatly appreciated.


Thank You @Artem Ervits for all the links. That was useful information.

I am still struggling to size a system for my requirements. I will be moving less than 100 GB of data per day, so the volume is not large, but my data comes in spurts. A rate of 20 Mbps at peak times will suffice. I am wondering if I can run my NiFi client from a VM with 8 GB of RAM and 200 GB of disk space instead of investing in a server.
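The figures in this post can be sanity-checked with some quick arithmetic. The sketch below is only a back-of-envelope estimate: it assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits/s) and ignores protocol overhead, neither of which is stated in the thread, and the function names are made up for illustration.

```python
# Back-of-envelope check of the numbers quoted in the post.
# Assumptions (not stated in the thread): decimal units
# (1 GB = 10**9 bytes, 1 Mbps = 10**6 bits/s) and no protocol overhead.

DAILY_VOLUME_GB = 100   # upper bound on data moved per day
PEAK_RATE_MBPS = 20     # peak transfer rate
DISK_GB = 200           # disk available on the VM

SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mbps(volume_gb: float) -> float:
    """Average rate needed to move volume_gb evenly over 24 hours."""
    bits = volume_gb * 10**9 * 8
    return bits / SECONDS_PER_DAY / 10**6


def hours_at_rate(volume_gb: float, rate_mbps: float) -> float:
    """Hours needed to move volume_gb at a constant rate_mbps."""
    bits = volume_gb * 10**9 * 8
    return bits / (rate_mbps * 10**6) / 3600


def days_of_data_on_disk(disk_gb: float, volume_gb: float) -> float:
    """How many days of daily volume fit on the disk if nothing is removed."""
    return disk_gb / volume_gb


if __name__ == "__main__":
    print(f"Average rate over a day: {average_rate_mbps(DAILY_VOLUME_GB):.2f} Mbps")
    print(f"Hours to move a day's data at peak rate: "
          f"{hours_at_rate(DAILY_VOLUME_GB, PEAK_RATE_MBPS):.1f}")
    print(f"Days of data that fit on disk: "
          f"{days_of_data_on_disk(DISK_GB, DAILY_VOLUME_GB):.1f}")
```

Under these assumptions, 100 GB per day averages out to a bit over 9 Mbps, a full day's data takes about 11 hours at the 20 Mbps peak rate, and 200 GB of disk holds roughly two days of data if nothing is cleared. This says nothing about CPU or memory, which depend on the processors used in the dataflow.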

You're welcome, see a comment above from mclark.

Shishir,

I agree that you should carefully review all the documented links provided by Artem Ervits, but you also need to understand that the loading behavior of any given NiFi instance is directly tied to which processors are being used. While some processors have little impact on CPU and/or memory, others can impact them significantly. Capacity planning needs to take into consideration the dataflows you want to run: what kind of data content manipulation you want to do (MergeContent, SplitContent, ReplaceText, etc.), data sizes and volumes, how many NiFi nodes you have, and how you plan to distribute the data load.

Thank You @mclark for surfacing this. That really makes sense.

Here is a basic sizing chart for HDF:

[Image: HDF basic sizing chart (2781-screen-shot-2016-03-14-at-114403-am.png)]

*** But keep in mind that these requirements may grow depending on which processors you use in your dataflow. Memory needs often grow more quickly than CPU needs.

*** Also understand that these sizing scenarios are based upon setting up your NiFi instance(s) per the best practice documentation provided.