Created 07-06-2023 09:48 AM
After successfully running a 3-node NiFi cluster without any issues for the past three years, I am considering setting up an additional cluster. Upon checking the resource utilization of the 3-node ZooKeeper cluster, I noticed that the nodes have very low compute utilization. I would like to know if it is recommended to stick with a setup of 3 NiFi nodes and 3 ZooKeeper nodes, totaling 6 nodes, or if it would be more beneficial to use embedded ZooKeeper.
Created 07-06-2023 12:17 PM
@hackerway
I would not recommend using the embedded ZooKeeper, because ZooKeeper quorum must be maintained for your NiFi cluster to remain stable. Losing a NiFi node also takes down the embedded ZooKeeper running with that NiFi.
While I would recommend keeping your ZooKeeper quorum installed on its own servers, separate from the NiFi service (NiFi can be a resource-intensive application depending on dataflow designs and data volumes), you may be able to run your external ZK servers and NiFi servers on the same hardware. It sounds like you have established dataflows and resource usage is low on your servers, but that may change if you extend your existing dataflows, add new dataflows, and/or data size or volume changes.
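To put numbers on the quorum point: ZooKeeper needs a strict majority of its ensemble to be up. The sketch below is just that arithmetic; the function names are made up for illustration and are not part of NiFi or ZooKeeper.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of a ZooKeeper ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many ZK servers can be down while a quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} ZK servers: quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

With embedded ZooKeeper on a 3-node NiFi cluster, every NiFi node that goes down (crash, restart, maintenance) also uses up that single tolerated ZK failure.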
You'll want to be looking at all resource consumption over time before making a decision here:
- CPU Load (core load average as well as peaks)
- Memory utilization on your servers, to make sure there is sufficient memory to support both ZK and NiFi.
- Disk I/O: NiFi can generate a lot of disk I/O with its local repositories (content, flowfile, and provenance), so if disk I/O is already high, adding ZK to the same host could impact the throughput of your NiFi.
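As a rough illustration of the CPU item above, here is a minimal sketch that turns load-average samples into an average and a peak per core. It assumes a Unix-like host (it uses `os.getloadavg()`), and `summarize_load` and `sample_load` are hypothetical helper names. In practice you would collect samples over days or weeks from your existing monitoring rather than from a one-off script, and look at memory and disk I/O the same way.

```python
import os
from statistics import mean


def summarize_load(samples, cores):
    """Return (average, peak) per-core load from 1-minute load-average samples."""
    per_core = [s / cores for s in samples]
    return mean(per_core), max(per_core)


def sample_load():
    """Read the current 1-minute load average (Unix-like systems only)."""
    return os.getloadavg()[0]


if __name__ == "__main__":
    try:
        cores = os.cpu_count() or 1
        avg, peak = summarize_load([sample_load()], cores)
        print(f"per-core load: avg={avg:.2f} peak={peak:.2f}")
    except (AttributeError, OSError):
        print("load average is not available on this platform")
```

The peak matters as much as the average here: a host that looks idle on average can still saturate during bursts, and that is when a co-located ZK would be competing with NiFi.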
If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.
Thank you,
Matt