NiFi cluster setup for NiFi 1.1.2

Contributor

Hello, I am trying to set up a cluster using 3 nodes, and I have the following question about clustering:

Does a NiFi cluster provide load balancing? Suppose I have 3 nodes in my cluster, all working on the same dataflow, and for some reason one node goes down. What happens to the remaining nodes? Do they keep working as usual, or do they wait for the failed node to resume?

1 ACCEPTED SOLUTION

Super Mentor

@Gaurav Jain

A NiFi cluster consists of the following core capabilities:

1. Cluster Coordinator - One node in a NiFi cluster is elected through ZooKeeper to be the cluster coordinator. Once an election is complete, all other nodes in the cluster send their health and status heartbeats directly to this cluster coordinator. If the currently elected cluster coordinator stops heartbeating to ZooKeeper, a new election is held to elect one of the other nodes as the new cluster coordinator.

2. Each node in a NiFi cluster runs independently of the others. Each node runs its own copy of the flow.xml.gz, has its own repositories, and works on its own FlowFiles. A node that becomes disconnected from the cluster (failed to send a heartbeat, network issues between nodes, etc.) will continue to run its dataflow. If it was disconnected because of a missed heartbeat, it will reconnect upon its next successful heartbeat.

3. Primary Node - Every cluster elects one of its nodes as the primary node. The role of the primary node is to run any processor that has been scheduled to run on "primary node only". The intent of this scheduling strategy is to help with processors that use protocols which are not cluster friendly, for example GetSFTP, ListSFTP, GetFTP, etc. Since every node in a cluster runs the same dataflow, you don't want these processors competing for the same files on every node. If the node that is currently elected as your primary node becomes disconnected from your cluster, it will stop running any processors configured as "primary node only". The cluster will also elect a new primary node, and that new node will start running the "primary node only" processors at that time.

4. When a cluster has a disconnected node, changes to the dataflow are not allowed. This prevents the flow.xml.gz from becoming mismatched across the cluster nodes. The disconnected node must either be rejoined to the cluster or dropped completely from the cluster before editing capability is restored.
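
The four points above can be sketched as a small toy model. This is only an illustration of the described behaviour, not NiFi code: the names `Node`, `ToyCluster`, and all of their methods are hypothetical, the "election" simply picks the first connected node instead of going through ZooKeeper, and the same rule is used for both the coordinator and the primary node (in a real cluster these are separate elections and may land on different nodes).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    connected: bool = True  # False once the node is disconnected from the cluster


class ToyCluster:
    """Toy model of the behaviour described above. Hypothetical, not a NiFi API."""

    def __init__(self, names):
        self.nodes = {n: Node(n) for n in names}
        self.coordinator = None
        self.primary = None
        self._elect()

    def _elect(self):
        # Stand-in for the ZooKeeper election: keep the current holder of each
        # role if it is still connected, otherwise pick the first connected node.
        live = [n.name for n in self.nodes.values() if n.connected]
        if self.coordinator not in live:
            self.coordinator = live[0] if live else None
        if self.primary not in live:
            self.primary = live[0] if live else None

    def disconnect(self, name):
        self.nodes[name].connected = False
        self._elect()

    def reconnect(self, name):
        self.nodes[name].connected = True
        self._elect()

    def drop(self, name):
        # Remove the node from the cluster entirely.
        del self.nodes[name]
        self._elect()

    def can_edit_flow(self):
        # Point 4: edits are blocked while any cluster member is disconnected.
        return all(n.connected for n in self.nodes.values())

    def runs_processor(self, name, primary_only=False):
        # Point 2: every node keeps running its own copy of the flow,
        # whether or not it is currently connected.
        # Point 3: "primary node only" processors run only on the elected primary.
        if primary_only:
            return self.nodes[name].connected and name == self.primary
        return True
```

For the three-node case in the question: after `disconnect("node1")`, the other two nodes keep running the flow, one of them takes over the "primary node only" processors, and `can_edit_flow()` stays false until node1 is reconnected or dropped.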

Thanks,

Matt


11 REPLIES

Super Mentor

@Gaurav Jain

Please explain what you mean when you say "complete workflow not working".

Screenshots may help if you can provide them.

Contributor

Can you give me a link to a complete tutorial on setting up a NiFi cluster and load balancing, with an example?