Created 02-01-2018 10:26 AM
Hi team,
We are using NiFi 1.2 with 2 nodes for clustering (let's say NodeA and NodeB). We are not able to understand how NiFi clustering works.
We created processors using the URL https://NodeA:9091/nifi/. If NodeA goes down, will the same processors be available if we access https://NodeB:9091/nifi/, or do we need to perform any manual task?
Regards
Laiju
Created 04-16-2018 06:53 PM
Here is a link to the documentation covering clustering for that version of NiFi: Clustering Configuration
Created 04-23-2018 01:39 PM
The most important thing to understand about NiFi's cluster architecture is that every node in the cluster runs with its own local copy of the flow.xml.gz (this file contains every configuration change any user has made through the NiFi UI: building flows on the canvas, adding reporting tasks, adding controller services, etc.).
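For illustration, one quick way to convince yourself of this is to hash the decompressed contents of the local flow.xml.gz on each node and compare the results. This is only a sketch: the path below assumes a default conf/ location, so adjust it to match your nifi.properties.

```python
# Hypothetical check: run this on each cluster node and compare the output.
# The path assumes the default flow.xml.gz location; adjust for your install.
import gzip
import hashlib

FLOW_PATH = "/opt/nifi/conf/flow.xml.gz"  # placeholder path

with gzip.open(FLOW_PATH, "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

# Identical digests on NodeA and NodeB mean both nodes carry the same flow.
print(f"{FLOW_PATH}: {digest}")
```

Hashing the decompressed content (rather than the .gz file itself) avoids false mismatches, since gzip headers can embed timestamps.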
-
Because of NiFi's HA control layer, a user can log in to any node in an active cluster and make changes on the canvas. The control layer takes care of replicating those changes to every node connected to the cluster.
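As a rough sketch of what that replication means in practice, you can ask each node's REST API for the root process group and compare what comes back. The node addresses and port below are the ones from the question; authentication is omitted and would be required on a secured cluster (token or client certificate), and TLS verification is disabled here only for brevity.

```python
# Sketch: fetch the root process group from each node and compare processor
# names. Both nodes should return the same flow. Authentication omitted.
import requests

NODES = ["https://NodeA:9091", "https://NodeB:9091"]

def processor_names(base_url):
    resp = requests.get(f"{base_url}/nifi-api/flow/process-groups/root",
                        verify=False)  # point verify at your CA bundle in practice
    resp.raise_for_status()
    flow = resp.json()["processGroupFlow"]["flow"]
    return sorted(p["component"]["name"] for p in flow["processors"])

names_per_node = {url: processor_names(url) for url in NODES}
print("Flows match:", len(set(map(tuple, names_per_node.values()))) == 1)
```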
-
Each node also runs with its own set of repositories (FlowFile, content, provenance, and database). Since NiFi does not currently have an HA data layer, should a node go down, the data currently being processed by that node will not be processed until the node is restarted. It is important that the FlowFile and content repositories (essential for data integrity) are protected by using RAID disk setups. It is actually easy to stand up an entirely new node that uses these same repositories and picks up where the old, dead node left off. There is no way to merge the contents of two nodes' repositories together, however.
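To illustrate that data locality, the cluster endpoint of the REST API reports each node's connection status along with how much data it is currently holding. This is a sketch against an unsecured or certificate-authenticated 1.x cluster; field names may vary slightly by version, so the queued value is read defensively.

```python
# Sketch: list each node's status and queued data, showing that FlowFiles
# live on the node that received them. Authentication omitted.
import requests

resp = requests.get("https://NodeA:9091/nifi-api/controller/cluster",
                    verify=False)  # supply a CA bundle / credentials in practice
resp.raise_for_status()

for node in resp.json()["cluster"]["nodes"]:
    print(f'{node["address"]}:{node["apiPort"]} '
          f'status={node["status"]} queued={node.get("queued", "n/a")}')
```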
-
Thank you,
Matt
Created 04-16-2018 08:08 PM
You do not need to do anything "special or manual" for the NiFi flow to run on the other machine in case of a node failure. NiFi employs a Zero-Master Clustering paradigm: each node in the cluster performs the same tasks on the data, but each operates on a different set of data.
So if a node fails, the other one has "sufficient information" to keep going.
You can have a more in-depth understanding here.
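To make the zero-master idea concrete, a client talking to the cluster can simply try the other node when one is unreachable, because either node can serve the full UI/API. The sketch below uses the two node URLs from the original question, leaves authentication out, and disables TLS verification only for brevity.

```python
# Sketch: prefer NodeA, fall back to NodeB if it is down. Either node holds
# the complete flow definition, so either can answer. Authentication omitted.
import requests

NODES = ["https://NodeA:9091", "https://NodeB:9091"]

def fetch_flow_status():
    for base_url in NODES:
        try:
            resp = requests.get(f"{base_url}/nifi-api/flow/status",
                                verify=False, timeout=5)
            resp.raise_for_status()
            return base_url, resp.json()
        except requests.RequestException:
            continue  # node unreachable, try the next one
    raise RuntimeError("no cluster node reachable")

node, status = fetch_flow_status()
print(f"answered by {node}: {status}")
```

Keep in mind, per Matt's answer above, that this only covers the control plane: data queued on the failed node waits until that node (or a replacement using its repositories) comes back.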