What could cause a NiFi primary/coordinator node change in a cluster?


We noticed today that NiFi re-elected the coordinator node, even though the previous coordinator node stayed connected according to the NiFi app log. What is the logic that kicks off a coordinator re-election?

Thanks,

Mark

1 ACCEPTED SOLUTION

@Mark Lin

ZooKeeper (ZK) is responsible for electing both the Cluster Coordinator and the Primary Node for a NiFi cluster.

Reasons why ZK may elect a new Cluster Coordinator or Primary Node include:

The node currently holding the role has not heartbeated to ZK:

- possibly because of network issues

- possibly because the current Cluster Coordinator and/or Primary Node is having issues that prevent heartbeats from being sent, such as Java garbage collection (GC). Because GC is a stop-the-world event, heartbeats are not sent out while it is running. By the time GC ends, ZK may have already elected a new Cluster Coordinator and/or Primary Node. The node is notified of the change the next time it successfully talks to ZK.


So you may never see the node actually become disconnected from the cluster. Nodes send heartbeats to the currently elected Cluster Coordinator, and these are separate from the heartbeats sent to ZK. As long as the cluster heartbeats arrive within the configured timeouts, the nodes stay connected in the cluster.
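
To make the difference between the two timeouts concrete, here is a rough sketch rather than NiFi's actual logic. It assumes a ZK session timeout of 10 seconds (the kind of value set via nifi.zookeeper.session.timeout), a cluster heartbeat interval of 5 seconds (nifi.cluster.protocol.heartbeat.interval), and a coordinator that tolerates roughly 8 missed heartbeat intervals before disconnecting a node. The class and function names are made up for illustration, and the real values come from your own nifi.properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    # All values below are assumptions for illustration; check nifi.properties.
    zk_session_timeout_s: float = 10.0   # assumed ZK session timeout
    heartbeat_interval_s: float = 5.0    # assumed cluster heartbeat interval
    missed_heartbeat_factor: int = 8     # assumed tolerance before disconnect


def effect_of_pause(pause_s: float, t: Timeouts = Timeouts()) -> dict:
    """Simplified model of what a stop-the-world pause of pause_s seconds triggers."""
    disconnect_after_s = t.heartbeat_interval_s * t.missed_heartbeat_factor
    return {
        # The ZK session expires, so the role can be handed to another node.
        "role_re_elected": pause_s > t.zk_session_timeout_s,
        # The coordinator only drops the node after a much longer silence.
        "node_disconnected": pause_s > disconnect_after_s,
    }


if __name__ == "__main__":
    for pause in (2, 15, 60):
        print(pause, effect_of_pause(pause))
```

With these assumed values, a 15 second pause is long enough to lose the elected role but far too short for the node to be disconnected, which matches what was seen in the app log.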


Thank you,

Matt
