Support Questions

Find answers, ask questions, and share your expertise

Issue in nifi clustering

Rising Star

I am trying to set up 3 different NiFi instances on a 3-node Hadoop cluster, one on each machine, but I am getting the error below:

6301-capture.png

I am getting this error on node2. All three NiFi instances are up, but they are not connected. I don't see anything under the connected-nodes tab in the NCM graph.

I have a feeling that I am missing some important property.

However, when I set up these 3 instances on the same machine, it worked fine with the same set of configs.

Can anyone please help. Thanks in advance.

Ankit @mclark @PJ Moutrie @Pierre Villard

1 ACCEPTED SOLUTION

Master Mentor

@Ankit Jain

When a NiFi instance is designated as a node, it starts sending out heartbeat messages once it is started. Those heartbeat messages contain important connection information for the node; part of that message is the hostname of each connecting node. If left blank, Java will try to determine the hostname, and in many cases the hostname ends up being "localhost". This may explain why the same configs worked when all instances were on the same machine. Make sure that all of the following properties have been set on every one of your nodes:
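As a quick sanity check on each machine (a minimal Python sketch, not part of NiFi, just mimicking the hostname lookup the JVM performs), you can see what hostname the machine reports for itself. If it comes back as "localhost" or resolves only to a loopback address, other nodes will not be able to reach it using an auto-detected address, and the host/address properties must be set explicitly:

```python
import socket

# What does this machine think its hostname is? If the FQDN is empty,
# "localhost", or resolves only to 127.0.0.1, NiFi's host/address
# properties must be set explicitly in nifi.properties.
fqdn = socket.getfqdn()
addresses = sorted({info[4][0] for info in socket.getaddrinfo(fqdn, None)})
print(fqdn, addresses)
```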

# Site to Site properties
nifi.remote.input.socket.host=   <-- Set to the FQDN for the Node; must be resolvable by all other instances.
nifi.remote.input.socket.port=   <-- Set to unused port on Node.

# web properties # 
nifi.web.http.host=   <-- set to resolvable FQDN for Node
nifi.web.http.port=   <-- Set to unused port on Node  

# cluster node properties (only configure for cluster nodes) #
nifi.cluster.is.node=true
nifi.cluster.node.address=         <-- set to resolvable FQDN for Node
nifi.cluster.node.protocol.port=   <-- Set to unused port on Node
nifi.cluster.node.protocol.threads=2
# if multicast is not used, nifi.cluster.node.unicast.xxx must have same values as nifi.cluster.manager.xxx #
nifi.cluster.node.unicast.manager.address=          <-- Set to the resolvable FQDN of your NCM
nifi.cluster.node.unicast.manager.protocol.port=    <-- must be set to Manager protocol port assigned on your NCM.

Your NCM will need to be configured the same way as above for the Site-to-Site properties and Web properties, but instead of the "Cluster Node properties", you will need to fill out the "cluster manager properties":

# cluster manager properties (only configure for cluster manager) #
nifi.cluster.is.manager=true    
nifi.cluster.manager.address=         <-- set to resolvable FQDN for NCM
nifi.cluster.manager.protocol.port=   <-- Set to unused port on NCM.

The most likely cause of your issue is not having the host/address fields populated or trying to use a port that is already in use on the server. If setting the above does not resolve your issue, try setting DEBUG for the cluster logging in the logback.xml on one of your nodes and the NCM to get more details:
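To rule out the port-in-use case before restarting NiFi, one quick check (a hedged Python sketch; the host/port pairs below are placeholders, not your actual config values) is to try connecting to each port you plan to assign:

```python
import socket

def port_in_use(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on a successful connection (port occupied)
        return s.connect_ex((host, port)) == 0

# Example: check the ports you plan to assign in nifi.properties.
# Replace 127.0.0.1 and the port numbers with your node FQDNs/ports.
for host, port in [("127.0.0.1", 9990), ("127.0.0.1", 9998)]:
    state = "IN USE" if port_in_use(host, port) else "free"
    print(f"{host}:{port} -> {state}")
```

A port that reports IN USE before NiFi starts must be changed in nifi.properties, or the instance holding it must be stopped.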

<logger name="org.apache.nifi.cluster" level="DEBUG"/>


3 REPLIES 3


Rising Star

Thanks @mclark, I was not setting these two properties:

# Site to Site properties
nifi.remote.input.socket.host=   <-- Set to the FQDN for the Node; must be resolvable by all other instances.
nifi.remote.input.socket.port=   <-- Set to unused port on Node.

Now my NiFi cluster is up. Thanks for your help.


+1 to mclarke's solution.