Support Questions

Find answers, ask questions, and share your expertise

NiFi cluster setup for NiFi 1.1.2

Contributor

Hello, I am trying to set up a cluster using 3 nodes, and I have the following question about clustering:

Does a NiFi cluster provide load balancing? Suppose all 3 nodes in my cluster run the same dataflow, and one node goes down for some reason. What happens to the remaining nodes? Do they keep working as usual, or do they wait for the first node to resume?

1 ACCEPTED SOLUTION

Master Mentor

@Gaurav Jain

A NiFi cluster provides the following core capabilities:

1. Cluster Coordinator - One node in a NiFi cluster is elected through ZooKeeper to be the cluster coordinator. Once the election is complete, all other nodes in the cluster send their health and status heartbeats directly to this cluster coordinator. If the currently elected cluster coordinator stops heartbeating to ZooKeeper, a new election is held to choose one of the other nodes as the new cluster coordinator.

2. Each node in a NiFi cluster runs independently of the others. Each node runs its own copy of the flow.xml.gz, has its own repositories, and works on its own FlowFiles. A node that becomes disconnected from the cluster (failed heartbeat, network issues between nodes, etc.) will continue to run its dataflow. If it was disconnected due to a missed heartbeat, it will reconnect upon the next successful heartbeat.

3. Primary Node - Every cluster elects one of its nodes as the primary node. The role of the primary node is to run any processor that has been scheduled to run on "primary node only". The intent of this scheduling strategy is to help with processor protocols that are not cluster friendly, for example GetSFTP, ListSFTP, GetFTP, etc. Since every node in a cluster runs the same dataflow, you don't want these competing protocols fighting over the same files on every node. If the node currently elected as your primary node becomes disconnected from the cluster, it will stop running any processors configured as "primary node only". The cluster will then elect a new primary node, and that node will start running the "primary node only" processors at that time.

4. When a cluster has a disconnected node, no changes to the dataflows are allowed. This prevents the flow.xml.gz from becoming mismatched across the cluster nodes. The disconnected node must be rejoined to the cluster, or dropped from it completely, before editing capability is restored.
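As a sketch, the cluster-related settings live in each node's conf/nifi.properties. The host names and ports below are example assumptions for one node of a 3-node setup, not values from this thread:

```properties
# conf/nifi.properties on each node (host names and ports are example assumptions)
nifi.cluster.is.node=true
nifi.cluster.node.address=node1.example.com
nifi.cluster.node.protocol.port=11443
# ZooKeeper quorum used for the cluster coordinator and primary node elections
nifi.zookeeper.connect.string=node1.example.com:2181,node2.example.com:2181,node3.example.com:2181
```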

Thanks,

Matt


Contributor

So it means that if the cluster has a disconnected node, the other nodes will not make any changes to flow.xml.gz until that node is connected to the cluster again?

Master Mentor
@Gaurav Jain

The flow.xml.gz file contains everything (processors, connections, controller services, etc.) that makes up the dataflow(s) on your canvas. If you try to make a change to a dataflow while a node is disconnected, NiFi will respond that changes are not allowed while a node is disconnected.

You can take manual steps to delete the disconnected node from the cluster via the cluster UI. This will return edit control, but the node you deleted will not be able to rejoin the cluster later (because the flows will no longer match) without additional manual steps.
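Those additional manual steps typically amount to clearing the stale local flow so the node inherits the cluster's copy when it restarts. A minimal sketch, assuming a standard install layout; the demo below runs against a throwaway directory standing in for a real $NIFI_HOME:

```shell
set -e
# Throwaway directory standing in for a real NiFi install ($NIFI_HOME path is an assumption)
NIFI_HOME=$(mktemp -d)
mkdir -p "$NIFI_HOME/conf"
printf 'stale-flow' > "$NIFI_HOME/conf/flow.xml.gz"   # stand-in for the mismatched local flow

# On the real node you would first stop NiFi:   "$NIFI_HOME/bin/nifi.sh" stop
# Move the stale flow aside so the node pulls the cluster's flow from the coordinator on restart:
mv "$NIFI_HOME/conf/flow.xml.gz" "$NIFI_HOME/conf/flow.xml.gz.bak"
# Then start NiFi again:                         "$NIFI_HOME/bin/nifi.sh" start

ls "$NIFI_HOME/conf"
```

On restart with no local flow.xml.gz present, the node accepts the flow offered by the cluster and can reconnect cleanly.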

Matt

Master Mentor

@Gaurav Jain

When you find an answer that addresses your question, please accept that answer to benefit others who come to this forum for help.

Thank you, Matt

Contributor

Thanks Matt

Contributor

Hi Matt,

In a cluster, each node has its own flow.xml.gz and repositories. Suppose one node has 100 FlowFiles to process.

Can this node transfer FlowFiles to another node in the cluster?

And if a node cannot transfer FlowFiles to other nodes, then what is the use of clustering?

Master Mentor

@Gaurav Jain

You can build into your dataflow the ability to redistribute FlowFiles between your nodes.

Below are just some of the benefits NiFi clustering provides:

1. Redundancy - You don't have a single point of failure. Your dataflows will still run even if a node is down or temporarily disconnected from your cluster.

2. Scalability - You can scale out the size of your cluster by adding additional nodes at any time.

3. Ease of management - Oftentimes a dataflow, or multiple dataflows, are constructed on the NiFi canvas. The volume of data may eventually push the limits of your hardware, necessitating additional hardware to support the processing load. You could stand up another standalone NiFi instance running the same dataflow, but then you would have two separate dataflows/canvases to manage. Clustering allows you to make a change in only one UI and have that change synced across multiple servers.

4. Site-to-Site - Provides load-balanced data delivery between NiFi endpoints.

As you design your dataflows, you must take into consideration how the data will be ingested:

- Are you running a listener of some sort on every node? In that case, source systems should push data to your cluster through some external load balancer.

- Are you pulling data into your cluster? Are you using a cluster-friendly source like JMS or Kafka, where multiple NiFi nodes can pull data at the same time? Or are you using non-cluster-friendly protocols to pull data, like SFTP or FTP? (In cases like this, load balancing should be handled through the List<protocol> --> RPG --> Input port --> Fetch<protocol> model.)
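That list/fetch redistribution model can be sketched roughly as follows (processor names follow the List/Fetch pattern; the site-to-site hop through the Remote Process Group is what spreads the work across nodes):

```
ListSFTP  (scheduled "primary node only"; emits one small FlowFile per remote file)
    └─> Remote Process Group ──site-to-site──> Input Port  (receives on all nodes)
                                                   └─> FetchSFTP  (each node fetches only its share)
```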

NiFi has data HA on its future roadmap, which will allow other nodes to pick up work on the data of a down node. Even when this is complete, I do not believe it will do any behind-the-scenes data redistribution.

Thanks,

Matt

Contributor

But in my case, if any node is disconnected, then the complete workflow stops working.