What could cause a NiFi primary/coordinator node change in a cluster?

New Contributor

We noticed today that NiFi re-elected the cluster coordinator node, even though the NiFi app log shows the previous coordinator node stayed connected. What is the logic that triggers a coordinator re-election?

Thanks,

Mark

1 ACCEPTED SOLUTION

Re: What could cause a NiFi primary/coordinator node change in a cluster?

Master Guru
@Mark Lin

ZooKeeper (ZK) is responsible for electing both the Cluster Coordinator and the Primary Node for a NiFi cluster.

Reasons why ZK may elect a new Cluster Coordinator or Primary Node include:

The currently elected node has not heartbeated to ZK:

- possibly because of network issues

- possibly because the current Cluster Coordinator and/or Primary Node is having issues that prevent heartbeats from being sent, such as Java garbage collection. Because GC is a stop-the-world event, heartbeats are not sent while GC is running. By the time GC ends, ZK may have already elected a new Cluster Coordinator and/or Primary Node. The node would be notified of the change the next time it successfully talks to ZK.

So you may never see the node actually become disconnected from the cluster. Nodes send heartbeats to the currently elected Cluster Coordinator. As long as those heartbeats arrive within the configured timeouts, the nodes stay connected in the cluster.
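
To make the difference between the two timeouts concrete, here is a rough Python sketch. It is a simplified model, not NiFi's actual logic. The default values are assumptions for illustration only; check `nifi.zookeeper.session.timeout` and `nifi.cluster.protocol.heartbeat.interval` in your own nifi.properties. The factor of 8 missed heartbeat intervals is also an assumption here, and the names `Timeouts` and `effect_of_pause` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    # Assumed values for illustration; read the real ones from nifi.properties.
    zk_session_timeout_s: float = 3.0    # nifi.zookeeper.session.timeout
    heartbeat_interval_s: float = 5.0    # nifi.cluster.protocol.heartbeat.interval
    missed_heartbeat_factor: int = 8     # assumption: intervals tolerated before disconnect


def effect_of_pause(pause_s: float, t: Timeouts = Timeouts()) -> dict:
    """Simplified model of a stop-the-world pause (e.g. GC) on an elected node.

    Returns whether the pause is long enough that ZK could expire the session
    (allowing a re-election) and whether it is long enough that the Cluster
    Coordinator could mark the node as disconnected.
    """
    disconnect_threshold_s = t.heartbeat_interval_s * t.missed_heartbeat_factor
    return {
        "zk_session_may_expire": pause_s > t.zk_session_timeout_s,
        "node_may_be_disconnected": pause_s > disconnect_threshold_s,
    }


if __name__ == "__main__":
    # A 10 second pause: long enough for ZK, too short for a cluster disconnect.
    print(effect_of_pause(10.0))
```

Under these assumed values, a pause between the two thresholds is enough for ZK to elect a new node while the paused node still shows as connected in the cluster, which matches what was seen in the app log.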


Thank you,

Matt
