Zookeeper Quorum Membership Test/Canary

New Contributor

We've recently run into an issue where ZK fails the Quorum Membership Test along with the canary test. This could be due to any number of reasons, but I'm having trouble finding what CM actually does to perform these tests, or any evidence of them being run.


I'm assuming it connects on 2181 (or 3181, the quorum membership port?), queries each member for its leader/follower status, and then attempts to write to the root directory using telnet or zkCli.


Is there a detailed description somewhere of the commands being run? I've read the descriptions, but they only generalize the steps. Something more precise, like what is displayed for the start/stop of services, would be fantastic.


Expert Contributor

Hi @Smashedcat32 

To give some background on the ZooKeeper Canary: the ServiceMonitor regularly checks the health of the ZooKeeper service by


1. connecting to the ZooKeeper quorum and locating the leader

2. creating a znode

3. reading the znode

4. deleting the znode


If any of these steps fails, the ServiceMonitor reports that ZOOKEEPER_CANARY_HEALTH has become bad.
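
If you want to reproduce something similar by hand, the create/read/delete steps can be approximated with the `zkCli.sh` client that ships with ZooKeeper. This is only a sketch of an equivalent check, not the commands the ServiceMonitor actually runs. The function name `zk_canary`, the `ZKCLI` variable, and the znode path are made up for illustration, and `zkCli.sh` exit codes differ between ZooKeeper versions, so look at the client output as well if a step reports success unexpectedly.

```shell
#!/usr/bin/env bash
# Hypothetical manual canary: create, read, and delete a throwaway znode.
# ZKCLI is an assumed variable pointing at the zkCli.sh client.
ZKCLI="${ZKCLI:-zkCli.sh}"

zk_canary() {
  local server="$1"                  # host:port of any quorum member
  local znode="/manual_canary_$$"    # illustrative path, unique per shell PID

  # Step 2: create a znode with a small payload.
  "$ZKCLI" -server "$server" create "$znode" probe >/dev/null 2>&1 \
    || { echo "FAILED: create $znode"; return 1; }

  # Step 3: read the znode back.
  "$ZKCLI" -server "$server" get "$znode" >/dev/null 2>&1 \
    || { echo "FAILED: read $znode"; return 2; }

  # Step 4: delete the znode again.
  "$ZKCLI" -server "$server" delete "$znode" >/dev/null 2>&1 \
    || { echo "FAILED: delete $znode"; return 3; }

  echo "OK: create/read/delete succeeded on $server"
}
```

Usage would be `zk_canary ZOOKEEPER_IP:ZOOKEEPER_PORT`. The return code tells you which step failed, which maps roughly onto the step numbers above.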


In the health reported above, the reason was "Canary test failed to establish a connection or a client session to the ZooKeeper service", which means it failed on step 1.


The problem could lie in three locations:


1. The ZooKeeper Quorum - slow fsync, long GC pauses, or a low max client connections limit

2. The Service Monitor - false reports

3. Network connectivity between the Service Monitor and the ZooKeepers
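
To help narrow down which of the three it is, the `mntr` four-letter command reports latency and connection counts for a server. A minimal sketch, assuming `nc` is installed and that `mntr` is permitted on your servers (on ZooKeeper 3.5+ it has to be listed in `4lw.commands.whitelist`). The function name `zk_health_fields` is just illustrative.

```shell
# Keep only the mntr fields that are most relevant when the canary fails.
# Reads mntr output (tab-separated key/value lines) on stdin.
zk_health_fields() {
  grep -E '^zk_(server_state|avg_latency|max_latency|outstanding_requests|num_alive_connections)[[:space:]]'
}

# Usage (run from the Service Monitor host):
#   echo mntr | nc ZOOKEEPER_IP ZOOKEEPER_PORT | zk_health_fields
```

If the connection itself fails from the Service Monitor host but works from the ZooKeeper host, that points at the network rather than the quorum. High latency or a growing number of outstanding requests points more towards the quorum itself.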


Now, coming to your query regarding the canary test commands: I don't think we have them available in the docs. You can use the commands from the ZooKeeper guide to test manually.

Example - to check whether a ZK instance is the leader:

echo stat | nc ZOOKEEPER_IP ZOOKEEPER_PORT | grep Mode
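
To run the same check against every member in one go, something like the following could work. It is a rough sketch: the function names are made up, `nc -w 2` assumes your `nc` variant supports a timeout flag, and on ZooKeeper 3.5+ `stat` may need to be whitelisted (`srvr` returns the same `Mode:` line and is allowed by default).

```shell
# Extract the role (leader, follower, standalone, ...) from stat/srvr output on stdin.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Print "host:port role" for each host:port argument.
# The role is empty if the server could not be reached or did not answer.
zk_quorum_modes() {
  local hp
  for hp in "$@"; do
    printf '%s %s\n' "$hp" \
      "$(echo stat | nc -w 2 "${hp%:*}" "${hp#*:}" 2>/dev/null | zk_mode)"
  done
}

# Usage: zk_quorum_modes zk1:2181 zk2:2181 zk3:2181
```

A healthy quorum should show exactly one leader and the rest as followers. A member with an empty role is the one to look at first.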