
We are planning to use NiFi zero-master clustering. Do we need a load balancer in front of the cluster to distribute data, given that the cluster already has a Cluster Coordinator?

 
1 ACCEPTED SOLUTION

Accepted Solutions

Super Guru

NiFi has implemented zero-master clustering since 1.0. Generally you don't need a load balancer (LB), but certain use cases may require one.


@sunile.manjee What do you mean by "certain use cases"?

Super Guru

That is an unbounded question. Knowing your use case would provide more value. For example, if you want to expose a single URL for NiFi development in the UI, you could use an LB with sticky sessions. Again, your question is unbounded, and you would be better off sharing your use case in another HCC question.

Master Guru
@Rakesh S


Short answer: The Cluster Coordinator has no role in data distribution within a cluster. Each NiFi node only works on data it specifically receives.

You might want a load balancer if:

you use any of the listener processors, such as ListenTCP or ListenHTTP.

you want to spread your user interface activity across nodes. (You'll need to pin sessions to nodes for this.)

you don't want to hardcode all your node hostnames into Remote Process Groups (RPGs), though there are still issues with this.

A big problem with configuring your load balancer is that a disconnected node continues to listen on the NiFi web UI/REST API port. You will have to write an external health check that authenticates to NiFi and gets the actual node status.
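As a rough illustration of such an external health check, here is a minimal Python sketch. It assumes the node exposes the `/nifi-api/flow/cluster/summary` REST endpoint and that the response carries a `clusterSummary.connectedToCluster` flag; both should be verified against the REST API documentation for your NiFi version. The function names and the bearer-token handling are hypothetical, and obtaining the token (or using client certificates instead) is left out.

```python
import json
import urllib.request


def node_is_healthy(summary_json: str) -> bool:
    """Return True only if the response says this node is connected to the cluster.

    Assumed response shape (check your NiFi version):
    {"clusterSummary": {"connectedToCluster": true, ...}}
    """
    try:
        summary = json.loads(summary_json).get("clusterSummary", {})
        return summary.get("connectedToCluster") is True
    except (ValueError, AttributeError):
        # Not JSON, or not the expected object structure: treat as unhealthy.
        return False


def fetch_cluster_summary(base_url: str, token: str, timeout: float = 5.0) -> str:
    """Fetch the cluster summary from one node (endpoint path is an assumption)."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/nifi-api/flow/cluster/summary",
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def check_node(base_url: str, token: str) -> bool:
    """Health check for a load balancer: any network or HTTP error counts as unhealthy."""
    try:
        return node_is_healthy(fetch_cluster_summary(base_url, token))
    except OSError:  # urllib's URLError and HTTPError are OSError subclasses
        return False
```

A load balancer could call a small wrapper around `check_node` once per node and take the node out of rotation whenever it returns `False`, so a node that is disconnected but still listening stops receiving UI and listener traffic.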

Master Guru

@David Miller

I don't understand the concern with a "disconnected node" still receiving data from a load balancer. Just because a node is disconnected does not mean it is not still running its dataflow(s).
If the NiFi service or hosting server is down, there will be no running listener to receive data from the load balancer, so the LB should fail over to a different server.

@Matt Clarke

If I have disconnected a node (say, to drain it for maintenance), I don't want my upstream sources to keep posting data to it. Site-to-Site (S2S) ports already have this behavior and close on disconnect.

If a node is disconnected, I don't want web UI user sessions to hit that node. At best the users will be very confused, and at worst they will make changes to the flow on that node that will cause problems when I reconnect it.

Master Guru

@David Miller

A Jira was raised to address that concern and the improvement was made in Apache NiFi 1.7.0:
https://issues.apache.org/jira/browse/NIFI-5208

@Matt Clarke Thanks for that info!