NiFi: Unable to elect Cluster Coordinator

Contributor

Hi,

NiFi is unable to form a cluster after a restart. It had been working fine for months without any issues; the problem appeared only after the restart. Can anyone please help? Below are my ZooKeeper configuration (zoo.cfg), the relevant nifi.properties settings, and excerpts from nifi-app.log. We are using an external ZooKeeper.

 

zoo.cfg

 

initLimit=100
autopurge.purgeInterval=24
autopurge.snapRetainCount=30
syncLimit=50
tickTime=2000
dataDir=/ngs/app/xxx/zookeeper-data-356_1
admin.enableServer=true
admin.serverPort=9990
standaloneEnabled=false
server.1=node1:2887:3887;2184
server.2=node2:2886:3886;2182
server.3=node3:2889:3889;2183

quorum.cnxn.threads.size=20
4lw.commands.whitelist=mntr,stat

 

nifi.properties

 

nifi.state.management.configuration.file=./conf/state-management.xml
# The ID of the local state provider
nifi.state.management.provider.local=local-provider
nifi.state.management.provider.cluster=zk-provider
nifi.state.management.embedded.zookeeper.start=false
nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties


nifi.cluster.is.node=true
nifi.cluster.node.address=node-xxxx
nifi.cluster.node.protocol.port=11443
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50
nifi.cluster.node.event.history.size=25
nifi.cluster.node.connection.timeout=10 sec
nifi.cluster.node.read.timeout=10 sec
nifi.cluster.node.max.concurrent.requests=100
nifi.cluster.firewall.file=
nifi.cluster.flow.election.max.wait.time=1 mins
nifi.cluster.flow.election.max.candidates=

nifi.zookeeper.connect.string=xxxx:2182,xxxx:2183,xxxx:2184
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/nifi

 

 

nifi-app.log on node 1:

 

2022-01-09 05:55:56,151 INFO [main] o.a.n.c.repository.FileSystemRepository Initializing FileSystemRepository with 'Always Sync' set to false
2022-01-09 05:55:56,379 INFO [main] o.apache.nifi.controller.FlowController Checking if there is already a Cluster Coordinator Elected...
2022-01-09 05:55:56,521 INFO [main] org.apache.curator.utils.Compatibility Using emulated InjectSessionExpiration
2022-01-09 05:55:56,590 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2022-01-09 05:55:56,609 INFO [main] org.apache.zookeeper.common.X509Util Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2022-01-09 05:55:56,622 INFO [main] org.apache.zookeeper.ClientCnxnSocket jute.maxbuffer value is 4194304 Bytes
2022-01-09 05:55:56,668 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Default schema
2022-01-09 05:55:56,719 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: CONNECTED
2022-01-09 05:55:56,757 INFO [main-EventThread] o.a.c.framework.imps.EnsembleTracker New config event received: {server.2=xxxxxx:2886:3886:participant;0.0.0.0:2182, server.1=xxxxxx:2887:3887:participant;0.0.0.0:2184, server.3=xxxxxx:2889:3889:participant;0.0.0.0:2183, version=0}
2022-01-09 05:55:56,777 INFO [main-EventThread] o.a.c.framework.imps.EnsembleTracker New config event received: {server.2=xxxxxx:2886:3886:participant;0.0.0.0:2182, server.1=xxxxxx:2887:3887:participant;0.0.0.0:2184, server.3=xxxxxx:2889:3889:participant;0.0.0.0:2183, version=0}
2022-01-09 05:55:56,790 INFO [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting
2022-01-09 05:55:56,913 INFO [main] o.apache.nifi.controller.FlowController It appears that no Cluster Coordinator has been Elected yet. Registering for Cluster Coordinator Role.
2022-01-09 05:55:56,914 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=true] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2022-01-09 05:55:56,917 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2022-01-09 05:55:56,918 INFO [main] org.apache.zookeeper.ClientCnxnSocket jute.maxbuffer value is 4194304 Bytes
2022-01-09 05:55:56,920 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Default schema
2022-01-09 05:55:56,924 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2022-01-09 05:55:56,924 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] started
2022-01-09 05:55:56,924 INFO [main] o.a.n.c.c.h.AbstractHeartbeatMonitor Heartbeat Monitor started
2022-01-09 05:55:56,952 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: CONNECTED

.....

2022-01-09 05:56:15,810 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow
2022-01-09 05:56:18,732 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: rn-ssenpcit-lapp205.rno.apple.com:8086
2022-01-09 05:56:18,771 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2022-01-09 05:56:18,771 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2022-01-09 05:56:19,787 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2022-01-09 05:56:19,787 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2022-01-09 05:56:20,798 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again

 

nifi-app.log on node 2:

 

2022-01-09 05:55:40,827 INFO [main] o.apache.nifi.controller.FlowController Checking if there is already a Cluster Coordinator Elected...
2022-01-09 05:55:40,893 INFO [main] org.apache.curator.utils.Compatibility Using emulated InjectSessionExpiration
2022-01-09 05:55:40,938 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2022-01-09 05:55:40,950 INFO [main] org.apache.zookeeper.common.X509Util Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2022-01-09 05:55:40,957 INFO [main] org.apache.zookeeper.ClientCnxnSocket jute.maxbuffer value is 4194304 Bytes
2022-01-09 05:55:40,985 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Default schema
2022-01-09 05:55:41,031 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: CONNECTED
2022-01-09 05:55:41,057 INFO [main-EventThread] o.a.c.framework.imps.EnsembleTracker New config event received: {server.2=rn-ssenpcit-lapp206.rno.apple.com:2886:3886:participant;0.0.0.0:2182, server.1=rn-ssenpcit-lapp206.rno.apple.com:2887:3887:participant;0.0.0.0:2184, server.3=rn-ssenpcit-lapp206.rno.apple.com:2889:3889:participant;0.0.0.0:2183, version=0}
2022-01-09 05:55:41,067 INFO [main-EventThread] o.a.c.framework.imps.EnsembleTracker New config event received: {server.2=rn-ssenpcit-lapp206.rno.apple.com:2886:3886:participant;0.0.0.0:2182, server.1=rn-ssenpcit-lapp206.rno.apple.com:2887:3887:participant;0.0.0.0:2184, server.3=rn-ssenpcit-lapp206.rno.apple.com:2889:3889:participant;0.0.0.0:2183, version=0}
2022-01-09 05:55:41,069 INFO [Curator-Framework-0] o.a.c.f.imps.CuratorFrameworkImpl backgroundOperationsLoop exiting
2022-01-09 05:55:41,186 INFO [main] o.apache.nifi.controller.FlowController It appears that no Cluster Coordinator has been Elected yet. Registering for Cluster Coordinator Role.
2022-01-09 05:55:41,187 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=true] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2022-01-09 05:55:41,191 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Starting
2022-01-09 05:55:41,192 INFO [main] org.apache.zookeeper.ClientCnxnSocket jute.maxbuffer value is 4194304 Bytes
2022-01-09 05:55:41,195 INFO [main] o.a.c.f.imps.CuratorFrameworkImpl Default schema
2022-01-09 05:55:41,201 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Registered new Leader Selector for role Cluster Coordinator; this node is an active participant in the election.
2022-01-09 05:55:41,202 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] started
2022-01-09 05:55:41,202 INFO [main] o.a.n.c.c.h.AbstractHeartbeatMonitor Heartbeat Monitor started
2022-01-09 05:55:41,205 INFO [main-EventThread] o.a.c.f.state.ConnectionStateManager State change: CONNECTED
......
2022-01-09 05:55:57,209 INFO [main] o.apache.nifi.controller.FlowController Successfully synchronized controller with proposed flow
2022-01-09 05:56:00,250 INFO [main] o.a.nifi.controller.StandardFlowService Connecting Node: rn-ssenpcit-lapp206.rno.apple.com:8086
2022-01-09 05:56:00,251 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2022-01-09 05:56:00,251 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2022-01-09 05:56:01,252 WARN [main] o.a.nifi.controller.StandardFlowService There is currently no Cluster Coordinator. This often happens upon restart of NiFi when running an embedded ZooKeeper. Will register this node to become the active Cluster Coordinator and will attempt to connect to cluster again
2022-01-09 05:56:01,252 INFO [main] o.a.n.c.l.e.CuratorLeaderElectionManager CuratorLeaderElectionManager[stopped=false] Attempted to register Leader Election for role 'Cluster Coordinator' but this role is already registered
2022-01-09 

 

 

8 REPLIES

Super Mentor

@samarsimha 

 

Before you restarted your NiFi nodes, did you make any recent changes to things like hostnames or ports?

You could try stopping your NiFi nodes, removing the NiFi local state directory on all nodes, and then restarting NiFi.

You can check the state-management.xml configuration file to see where each node is keeping local state.  The default for Apache NiFi is "./state/local".
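
To find the local state directory without reading the XML by hand, a small script can pull the Directory property out of the local provider. This is only a minimal sketch: the sample XML is a hand-written stand-in that assumes the usual layout of state-management.xml (a local-provider element containing a property named Directory), and local_state_directory is a hypothetical helper name, not part of NiFi.

```python
import xml.etree.ElementTree as ET


def local_state_directory(xml_text):
    """Return the Directory property of the <local-provider>, or None if absent."""
    root = ET.fromstring(xml_text)
    provider = root.find("local-provider")
    if provider is None:
        return None
    for prop in provider.findall("property"):
        if prop.get("name") == "Directory":
            value = (prop.text or "").strip()
            return value or None
    return None


# Hand-written sample that mimics the assumed layout of state-management.xml.
SAMPLE = """<stateManagement>
  <local-provider>
    <id>local-provider</id>
    <property name="Directory">./state/local</property>
  </local-provider>
</stateManagement>"""

if __name__ == "__main__":
    print(local_state_directory(SAMPLE))
```

To run it against a real node, read the file named by nifi.state.management.configuration.file (./conf/state-management.xml in the question) and pass its text to the function. A relative result such as ./state/local is presumably resolved against the NiFi installation directory.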

If you found that this response helped with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,

Matt

New Contributor

The issue is fixed now; it was a ZooKeeper issue.

New Contributor

@SamarApple, can you describe what the ZooKeeper issue was? I have the same problem.

New Contributor

I had a znode ACL issue in ZooKeeper. Once the ACL access was fixed, it started working fine.

Contributor

I have the same problem.

Please, how did you solve this?

New Contributor

@rafy, hi! I just solved this issue. On CentOS 7, I just opened some ports: nifi.cluster.node.protocol.port and the ZooKeeper ports in zookeeper.properties.

Contributor

I am on CentOS 8.5.

NiFi 1.16.2

I set something like this in the nifi.properties file:

nifi.cluster.node.protocol.port=9991

 

And in the ZooKeeper configuration:

server.1=masternode:2888:3888;2181
server.2=workernode02:2888:3888;2181
server.3=workernode03:2888:3888;2181
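
Each server.N line packs three ports: the quorum port, the election port, and (after the semicolon) the client port. To list what each host needs reachable, the lines can be parsed as below. This is a rough sketch, not a full zoo.cfg parser: zookeeper_ports and the regular expression are hypothetical helpers, and they only cover the forms that appear in this thread (with or without a role such as participant, and with or without a bind address before the client port).

```python
import re

# server.<id>=<host>:<quorum>:<election>[:role][;[<bind address>:]<client port>]
SERVER_LINE = re.compile(
    r"^server\.(\d+)=([^:;\s]+):(\d+):(\d+)(?::\w+)?(?:;(?:[^:;\s]+:)?(\d+))?\s*$"
)


def zookeeper_ports(cfg_text):
    """Map each server id to its host and the ports named on its server.N line."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if not match:
            continue  # not a server.N line (or a form this sketch does not cover)
        server_id, host, quorum, election, client = match.groups()
        entry = {"host": host, "quorum": int(quorum), "election": int(election)}
        if client is not None:
            entry["client"] = int(client)
        servers[int(server_id)] = entry
    return servers


# The three lines from the post above.
SAMPLE = """server.1=masternode:2888:3888;2181
server.2=workernode02:2888:3888;2181
server.3=workernode03:2888:3888;2181"""

if __name__ == "__main__":
    for server_id, entry in sorted(zookeeper_ports(SAMPLE).items()):
        print(server_id, entry)
```

The client ports found this way should match the ports listed in nifi.zookeeper.connect.string, and all three kinds of port, plus nifi.cluster.node.protocol.port, need to be allowed between the nodes.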

New Contributor

Just add these ports to firewalld.
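
After changing the firewall rules, it can help to confirm from each node that the other nodes' ports actually accept TCP connections. The sketch below is one way to do that with only the Python standard library. The hostnames and port numbers in TARGETS are copied from the posts above and should be replaced with your own; port_is_open and report are hypothetical helper names.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and hostname resolution failures.
        return False


# Example targets taken from this thread; adjust to your own cluster.
TARGETS = [
    ("masternode", 2181),    # ZooKeeper client port
    ("masternode", 3888),    # ZooKeeper election port
    ("workernode02", 9991),  # nifi.cluster.node.protocol.port
]


def report(targets):
    """Print one line per target saying whether the port accepted a connection."""
    for host, port in targets:
        state = "open" if port_is_open(host, port) else "unreachable"
        print(f"{host}:{port} {state}")
```

Calling report(TARGETS) on each node gives a quick picture of what is blocked. A port that shows as unreachable may be blocked by the firewall, or the service may simply not be listening there; for example, as far as I know the ZooKeeper quorum port (2888 above) is normally only listened on by the current leader, so it is left out of the list.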