How clustering works for NiFi

Hi team,

We are using NiFi 1.2 with a two-node cluster (let's say NodeA and NodeB). We are not able to understand how NiFi clustering works.

We created processors using the URL https://NodeA:9091/nifi/. If NodeA goes down, will the same processors be available if we access https://NodeB:9091/nifi/? Or do we need to perform a manual task?

Regards

Laiju

3 REPLIES

@laiju cbabu

Here is a link to the documentation covering clustering for that version of NiFi: Clustering Configuration

@laiju cbabu

The most important thing to understand about NiFi's cluster architecture is that every node in the cluster runs with its own local copy of the flow.xml.gz. This file contains every configuration change any user has made through the NiFi UI (building flows on the canvas, adding reporting tasks, adding controller services, etc.).

Because of NiFi's HA control layer, a user can log in to any node in an active cluster and make changes on the canvas. The NiFi control layer takes care of replicating those changes to every node connected to that cluster.
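To make the idea concrete, here is a toy Python model of that behavior. It is only a conceptual sketch, not NiFi code: the `FlowNode` and `ToyCluster` classes are made up for illustration, a plain list stands in for each node's flow.xml.gz, and it ignores what NiFi does when a node is disconnected at the time of the change.

```python
class FlowNode:
    """Hypothetical stand-in for one NiFi node."""

    def __init__(self, name):
        self.name = name
        self.flow = []  # stands in for this node's local flow.xml.gz


class ToyCluster:
    """Hypothetical stand-in for the cluster's control layer."""

    def __init__(self, nodes):
        self.nodes = nodes

    def apply_change(self, via, change):
        # A change made through any one node (`via`) is replicated
        # to the local flow copy of every node in the cluster.
        for node in self.nodes:
            node.flow.append(change)


node_a, node_b = FlowNode("NodeA"), FlowNode("NodeB")
cluster = ToyCluster([node_a, node_b])

# The user builds the flow while logged in to NodeA ...
cluster.apply_change(node_a, "add processor GetFile")

# ... and NodeB already holds the same flow in its own local copy.
print(node_b.flow)
```

In this model NodeB does not depend on NodeA to know the flow, which is why the same processors show up when you open the UI on NodeB.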

Each node also runs with its own set of repositories (FlowFile, content, provenance, and database). Since NiFi does not currently have an HA data layer, should a NiFi node go down, the data currently being processed by that node will not be processed until that node is restarted. It is important that the FlowFile and content repositories (essential for data integrity) are protected by using RAID disk setups. It is actually easy to stand up an entirely new node that uses these same repos and picks up where the old dead node left off. There is no way, however, to merge the contents of two nodes' repositories together.
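The data side can be sketched the same way. Again this is a toy model under stated assumptions, not NiFi's implementation: `RepoNode` is a made-up class, and a simple queue stands in for a node's FlowFile and content repositories.

```python
from collections import deque


class RepoNode:
    """Hypothetical stand-in for one NiFi node and its local repositories."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.queue = deque()  # stands in for the FlowFile/content repos

    def process(self):
        """Process everything queued locally; a down node processes nothing."""
        done = []
        while self.up and self.queue:
            done.append(self.queue.popleft())
        return done


node_a, node_b = RepoNode("NodeA"), RepoNode("NodeB")
node_a.queue.extend(["a1", "a2"])
node_b.queue.extend(["b1"])

# NodeA goes down: NodeB keeps working on its own data only.
node_a.up = False
processed = node_a.process() + node_b.process()
stranded = list(node_a.queue)  # still sitting in NodeA's repositories

# NodeA is restarted (or a new node is stood up on the same repos).
node_a.up = True
recovered = node_a.process()

print(processed, stranded, recovered)
```

The point of the sketch is that NodeB never sees NodeA's queued data; it only becomes processable again once something is running on top of NodeA's repositories.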

Thank you,

Matt

@laiju cbabu

You do not need to do anything "special or manual" for the NiFi flow to run on the other machine in case of a node failure. NiFi employs a Zero-Master Clustering paradigm. Each node in the cluster performs the same tasks on the data, but each operates on a different set of data.

So if a node fails, the other one has "sufficient information" to keep running the flow.
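From the user's side, the only thing that changes is which URL you open. As a rough sketch (the helper names `tcp_probe` and `first_reachable` are made up for illustration, and a successful TCP connection is only an assumption that the UI is up, not a NiFi health check), a small Python script could pick the first node that answers:

```python
import socket
from urllib.parse import urlsplit

# Node URLs taken from the question above.
NODE_URLS = ["https://NodeA:9091/nifi/", "https://NodeB:9091/nifi/"]


def tcp_probe(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(urls, probe=tcp_probe):
    """Return the first URL for which `probe` reports success, else None."""
    for url in urls:
        if probe(url):
            return url
    return None


# Simulate NodeA being down with a fake probe, so no network is needed here.
chosen = first_reachable(NODE_URLS, probe=lambda url: "NodeB" in url)
print(chosen)
```

Against a real cluster you would call `first_reachable(NODE_URLS)` and open whatever URL it returns.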

You can find a more in-depth explanation here.