
NiFi nodes will not connect to the cluster

Contributor

Hello @Matt

I am working with NiFi 1.0.0 and have five nodes that I want to form a single cluster. When I start them up, two or three of them form a cluster, with the others forming their own individual clusters. Since NiFi 1.0.0 does not need a cluster manager to form a cluster, I did not elect one. Should I? I have seen it done in older versions but not for 1.0.0. Can you walk me through it?

I have updated state-management.xml to reflect my ZooKeeper instances, updated nifi.properties for the site-to-site properties, cluster properties, etc., and updated the zookeeper.properties and authorizers.xml files to reflect the hostnames of all five nodes.
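
For reference, the cluster-provider in my state-management.xml looks roughly like this (the hostnames below are placeholders for my actual machines):

<cluster-provider>
    <id>zk-provider</id>
    <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
    <property name="Connect String">node-1:2181,node-2:2181,node-3:2181,node-4:2181,node-5:2181</property>
    <property name="Root Node">/nifi</property>
    <property name="Session Timeout">10 seconds</property>
    <property name="Access Control">Open</property>
</cluster-provider>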

Can you help?

Michael

9 Replies

Master Guru

See: http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.1/bk_dataflow-administration/content/clusterin...

http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.1.1/bk_dataflow-administration/content/state_pro...

Under Cluster Node Properties, set the following:

  • nifi.cluster.is.node - Set this to true.
  • nifi.cluster.node.address - Set this to the fully qualified hostname of the node. If left blank, it defaults to "localhost".
  • nifi.cluster.node.protocol.port - Set this to an open port that is higher than 1024 (anything lower requires root).
  • nifi.cluster.node.protocol.threads - The number of threads that should be used to communicate with other nodes in the cluster. This property defaults to 10, but for large clusters, this value may need to be larger.
  • nifi.zookeeper.connect.string - The Connect String that is needed to connect to Apache ZooKeeper. This is a comma-separated list of hostname:port pairs. For example, localhost:2181,localhost:2182,localhost:2183. This should contain a list of all ZooKeeper instances in the ZooKeeper quorum.
  • nifi.zookeeper.root.node - The root ZNode that should be used in ZooKeeper. ZooKeeper provides a directory-like structure for storing data. Each directory in this structure is referred to as a ZNode. This denotes the root ZNode, or directory, that should be used for storing data. The default value is /nifi. This is important to set correctly, as which cluster the NiFi instance attempts to join is determined by which ZooKeeper instance it connects to and the ZooKeeper Root Node that is specified.
  • nifi.cluster.flow.election.max.wait.time - Specifies the amount of time to wait before electing a Flow as the "correct" Flow. If the number of Nodes that have voted is equal to the number specified by the nifi.cluster.flow.election.max.candidates property, the cluster will not wait this long. The default is 5 minutes. Note that the time starts as soon as the first vote is cast.
  • nifi.cluster.flow.election.max.candidates - Specifies the number of Nodes required in the cluster to cause early election of Flows. This allows the Nodes in the cluster to avoid having to wait a long time before starting processing if we reach at least this number of nodes in the cluster.

Make sure all nodes point to the same ZooKeeper quorum, are on the same network, and can reach each other on all of the required ports.
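
Putting those together, a minimal cluster section in nifi.properties for one node might look like this (hostnames and ports are examples only; use your own):

# on node-1; repeat on every node, changing nifi.cluster.node.address accordingly
nifi.cluster.is.node=true
nifi.cluster.node.address=node-1
nifi.cluster.node.protocol.port=9999
nifi.cluster.node.protocol.threads=10
nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181
nifi.zookeeper.root.node=/nifi
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=3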

Contributor

Hi @Timothy Spann

I have set the values you mentioned, and even updated the zookeeper.properties, state-management.xml, and authorizers.xml files with the appropriate node/port information, but clustering now results in two clusters of two nodes each, plus one node that does not join any cluster.

Best,

Michael

Master Guru

see: https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/

The first thing is to configure the list of the ZK (ZooKeeper) instances in the configuration file ./conf/zookeeper.properties. Since our three NiFi instances will run the embedded ZK instance, I just have to complete the file with the following properties:

server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888

Then, everything happens in ./conf/nifi.properties. First, I specify that NiFi must run an embedded ZK instance, with the following property:

nifi.state.management.embedded.zookeeper.start=true

I also specify the ZK connect string:

nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181

As you can notice, the ./conf/zookeeper.properties file has a property named dataDir. By default, this value is set to ./state/zookeeper. If more than one NiFi node is running an embedded ZK, it is important to tell each server which one it is. This is done by creating a file named myid in the dataDir, containing that server's number.
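
For example (paths assume the default dataDir):

# on node-1
mkdir -p ./state/zookeeper
echo 1 > ./state/zookeeper/myid

# on node-2, write 2 instead, and so on for each node running an embedded ZK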

Contributor

@Timothy Spann

I did all of these and configured the state-management.xml, zookeeper.properties, nifi.properties, and authorizers.xml files.

After much experimentation, I got all five nodes to cluster; here is what I did at a high level.

1. Get two nodes to cluster

2. For every node thereafter, start and restart each node until the node "joins" the cluster

It can be time-consuming, but it worked.

Master Guru

That's painful. I will send this discussion to the NiFi committers.

New Contributor

Any update on this query? I am also facing the problem of two isolated clusters forming.

Expert Contributor

@Michael Silas For clarification, which cluster a node joins is determined by two properties in nifi.properties: nifi.zookeeper.connect.string and nifi.zookeeper.root.node. All of the nodes need to have the same value for these two properties. Also please ensure that you do not copy the 'state' directory from one node to another - one of the state elements is the node ID, and in version 1.0.0 it didn't do a great job of handling the case where two nodes used the same ID - that was fixed in 1.1.0 (in general I'd recommend using 1.1.0 if possible over 1.0.0 because there were several cluster-related issues addressed in 1.1.0).

Additionally, because you are using an embedded ZooKeeper, I would ensure that conf/zookeeper.properties has the same values on all nodes for the server.1, server.2, ... server.N properties, as @Timothy Spann mentioned above, and that all nodes that have the nifi.state.management.embedded.zookeeper.start property of nifi.properties set to true are also listed as a server.N entry (i.e., if all 5 NiFi nodes have nifi.state.management.embedded.zookeeper.start set to true, then you should have server.1 through server.5 in your zookeeper.properties file and all five hosts in your nifi.properties connect string). It's also important to ensure that each node is able to reach all other nodes, as ZooKeeper can become pretty unhappy when one node is unable to communicate with other nodes.
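
Concretely, every node should have identical values for these two nifi.properties entries (hostnames here are placeholders):

nifi.zookeeper.connect.string=node-1:2181,node-2:2181,node-3:2181,node-4:2181,node-5:2181
nifi.zookeeper.root.node=/nifi

And, if all five nodes run an embedded ZK, the same server list should appear in conf/zookeeper.properties on every node:

server.1=node-1:2888:3888
server.2=node-2:2888:3888
server.3=node-3:2888:3888
server.4=node-4:2888:3888
server.5=node-5:2888:3888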

Does this help?

Contributor

@mpayne, I have five ZooKeeper instances running, and they are configured properly in the zookeeper.properties file. I am not using the embedded ZooKeeper since I am forming a cluster, and using the embedded ZooKeeper on every node is not allowed. I never designated a root node since the documentation did not require it. Should I be doing this?
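
If I should set it explicitly, I assume it would just be the same value in nifi.properties on all five nodes, e.g.:

nifi.zookeeper.root.node=/nifi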

I will look into ensuring that all nodes can communicate with all ZooKeeper instances.

Thank you for the advice!