ZOOKEEPER CHECK FAILED

Hello,

My ZooKeeper is running fine on my 4 nodes, but when I run the service check it fails and I get this error on stderr:


Traceback (most recent call last):  
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper    
    result = _call(command, **kwargs_copy)  
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call    
    raise ExecutionFailed(err_msg, code, out, err) 
ExecutionFailed: Execution of '/var/lib/ambari-agent/tmp/zkSmoke.sh /usr/hdp/current/zookeeper-client/bin/zkCli.sh ambari-qa /usr/hdp/current/zookeeper-client/conf 2181 False kinit no_keytab no_principal /var/lib/ambari-agent/tmp/zkSmoke.out' returned 4. zk_node1=master.rh.bigdata.cluster 
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zk_smoketest     
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)     
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)     
    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)     
    at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:708)     
    at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:596)     
    at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:368)     
    at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:328)     
    at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:287)


I also believe that this issue affects other services!

4 REPLIES

Master Mentor

@Adil BAKKOURI

Ambari simply uses the ZooKeeper smoke test script "zkSmoke.sh" to verify the ZooKeeper connection and to check whether it can create and delete the dummy znode "/zk_smoketest" on the ZooKeeper host.
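
For context, the sequence the smoke test runs can be approximated by hand. This is only a rough sketch, not Ambari's actual zkSmoke.sh: it assumes the zkCli.sh path from the command in this thread, and the helper name zk_smoke and the payload smoke_data are made up for illustration. The delete, create, get order mirrors the calls that show up in the stack traces in this thread.

```shell
#!/usr/bin/env bash
# Rough sketch of a create/read/delete smoke test; NOT Ambari's zkSmoke.sh.
# ZKCLI defaults to the client path from this thread; override it if yours differs.
ZKCLI="${ZKCLI:-/usr/hdp/current/zookeeper-client/bin/zkCli.sh}"

# zk_smoke HOST:PORT  (hypothetical helper name)
# Returns 0 only if the dummy znode can be written and read back.
zk_smoke() {
    local server="$1" node="/zk_smoketest"
    # Clear any leftover node from an earlier run; an error is expected
    # when the node does not exist, so the result is ignored.
    "$ZKCLI" -server "$server" delete "$node" >/dev/null 2>&1 || true
    # Create the node with a marker payload, then read it back.
    "$ZKCLI" -server "$server" create "$node" smoke_data >/dev/null 2>&1 || true
    "$ZKCLI" -server "$server" get "$node" 2>&1 | grep -q smoke_data
    local rc=$?
    # Clean up so the next run starts fresh.
    "$ZKCLI" -server "$server" delete "$node" >/dev/null 2>&1 || true
    return "$rc"
}

# Example:
#   zk_smoke master.rh.bigdata.cluster:2181 && echo "smoke test passed"
```

If the get step never returns the payload, the problem is in the connection or in the server, not in the Ambari check itself.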

As you can see in the following message:

ExecutionFailed: Execution of '/var/lib/ambari-agent/tmp/zkSmoke.sh /usr/hdp/current/zookeeper-client/bin/zkCli.sh ambari-qa /usr/hdp/current/zookeeper-client/conf 2181 False kinit no_keytab no_principal /var/lib/ambari-agent/tmp/zkSmoke.out' returned 4. zk_node1=master.rh.bigdata.cluster 
.
.
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zk_smoketest  


This means your ZooKeeper process might not be running properly. Most probably the ZooKeeper port is not accessible on host "master.rh.bigdata.cluster".

So first check whether ZooKeeper is running on host "master.rh.bigdata.cluster" and whether it is listening on port 2181:

# ps -ef | grep -i zookeeper
# netstat -tnlpa | grep 2181
# service iptables stop
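
In addition to the process and socket checks, ZooKeeper 3.4.x answers short "four-letter word" commands on its client port, which show whether the server is actually serving requests. A minimal sketch, assuming nc is installed; the helper names zk_4lw and zk_mode are made up for illustration. Note that ruok only shows that the process answers, while srvr reports the mode (leader, follower, standalone) or a "not currently serving requests" message. Newer ZooKeeper releases may require these commands to be whitelisted first.

```shell
#!/usr/bin/env bash
# zk_4lw HOST [PORT] [WORD]: send a four-letter word and print the reply.
zk_4lw() {
    local host="$1" port="${2:-2181}" word="${3:-ruok}"
    printf '%s' "$word" | nc -w 5 "$host" "$port"
}

# zk_mode: read a "srvr" or "stat" reply on stdin and summarize it.
zk_mode() {
    local reply
    reply="$(cat)"
    case "$reply" in
        *"not currently serving requests"*) echo "not-serving" ;;
        *Mode:*) printf '%s\n' "$reply" | sed -n 's/^Mode: *//p' ;;
        *) echo "unknown" ;;
    esac
}

# Example (run from the host where the check fails):
#   zk_4lw master.rh.bigdata.cluster 2181 ruok      # a live process replies "imok"
#   zk_4lw master.rh.bigdata.cluster 2181 srvr | zk_mode
```

If this prints not-serving, the port is open but the server is not part of a working quorum, which could explain a ConnectionLoss even when the process and the port both look fine.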


Most Likely Cause:

If you see that the ZooKeeper process is not running, please try to restart it, check the ZooKeeper logs for any issue or startup failure, and share the log.

Most probably, once ZooKeeper is running fine and port 2181 is accessible, your ZooKeeper service check should also pass.

Firewall or Port Blocking issue:

Please go to the host where the smoke test is failing and try to telnet to the ZooKeeper host and port to see whether it is accessible:

# telnet  master.rh.bigdata.cluster  2181
(OR)
# nc -v master.rh.bigdata.cluster  2181
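
If neither telnet nor nc is installed on that host, bash itself can open a TCP connection through its /dev/tcp redirection. A small sketch, assuming bash and the coreutils timeout command are available; port_open is a made-up helper name, and the host list in the example is taken from the node names mentioned in this thread.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection can be opened within 5 seconds.
port_open() {
    # The inner bash receives HOST as $0 and PORT as $1.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example: probe the ZooKeeper client port on every node.
#   for h in master node2 node3 node4; do
#       if port_open "$h.rh.bigdata.cluster" 2181; then
#           echo "$h: open"
#       else
#           echo "$h: closed or filtered"
#       fi
#   done
```

A successful connection only proves that the port is reachable; it does not prove that ZooKeeper is serving requests on it.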


Manual Test:

You can also run the same command yourself to verify the ZooKeeper smoke test. Try running it as-is on different ZooKeeper nodes.

# /var/lib/ambari-agent/tmp/zkSmoke.sh /usr/hdp/current/zookeeper-client/bin/zkCli.sh ambari-qa /usr/hdp/current/zookeeper-client/conf 2181 False kinit no_keytab no_principal /var/lib/ambari-agent/tmp/zkSmoke.out



Hi @Jay Kumar SenSharma,

This is the output of ps -ef | grep -i zookeeper on master.rh.bigdata.cluster:

root@RHBigData1:~# ps -ef | grep -i zookeeper
root     11171 11148  0 09:58 pts/0    00:00:00 grep --color=auto -i zookeeper
zookeep+ 17902     1  0 Jun13 ?        00:04:15 /usr/jdk64/jdk1.8.0_112/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.log.file=zookeeper-zookeeper-server-RHBigData1.log -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp /usr/hdp/current/zookeeper-server/bin/../build/classes:/usr/hdp/current/zookeeper-server/bin/../build/lib/*.jar:/usr/hdp/current/zookeeper-server/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/slf4j-api-1.6.1.jar:/usr/hdp/current/zookeeper-server/bin/../lib/netty-3.10.5.Final.jar:/usr/hdp/current/zookeeper-server/bin/../lib/log4j-1.2.16.jar:/usr/hdp/current/zookeeper-server/bin/../lib/jline-0.9.94.jar:/usr/hdp/current/zookeeper-server/bin/../zookeeper-3.4.6.3.1.0.0-78.jar:/usr/hdp/current/zookeeper-server/bin/../src/java/lib/*.jar:/usr/hdp/current/zookeeper-server/conf::/usr/share/zookeeper/*:/usr/share/zookeeper/* -Xmx1024m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/hdp/current/zookeeper-server/conf/zoo.cfg


This is the output of netstat -tnlpa | grep 2181 on master.rh.bigdata.cluster:

root@RHBigData1:~# netstat -tnlpa | grep 2181
tcp6       0      0 :::2181                 :::*                    LISTEN      17902/java
tcp6       0      0 172.16.138.156:2181     172.16.138.156:59142    TIME_WAIT   -

I can't understand why the iptables service is not loaded:

root@RHBigData1:~# service iptables stop
Failed to stop iptables.service: Unit iptables.service not loaded.
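
The "Unit iptables.service not loaded" message usually just means this distribution does not manage its firewall through a service named iptables; it does not by itself say whether packet filtering is active. A hedged sketch for finding out which firewall front ends exist on the host; firewall_tools is a made-up helper name, and the list of tools is an assumption about common Linux setups, not something taken from this cluster.

```shell
#!/usr/bin/env bash
# firewall_tools: print the firewall front ends found on PATH, one per line.
firewall_tools() {
    local tool
    for tool in ufw firewall-cmd nft iptables; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
    return 0
}

# Then inspect whichever tool is present, for example (as root):
#   ufw status verbose
#   firewall-cmd --state
#   nft list ruleset
#   iptables -L -n
```

An empty rule list from these commands would suggest that the local firewall is not what blocks the smoke test.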


From another host, "node4", which is my cluster master:


root@node4:~# telnet  master.rh.bigdata.cluster  2181
Trying 172.16.138.156...
Connected to master.rh.bigdata.cluster.
Escape character is '^]'.


Connection closed by foreign host.

But when I try the manual test, it still does not work. Same error!

root@node4:~# /var/lib/ambari-agent/tmp/zkSmoke.sh /usr/hdp/current/zookeeper-client/bin/zkCli.sh ambari-qa /usr/hdp/current/zookeeper-client/conf 2181 False kinit no_keytab no_principal /var/lib/ambari-agent/tmp/zkSmoke.out
zk_node1=master.rh.bigdata.cluster
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zk_smoketest
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:708)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:596)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:368)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:328)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:287)
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zk_smoketest
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:703)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:596)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:368)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:328)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:287)
Running test on host master.rh.bigdata.cluster
Connecting to master.rh.bigdata.cluster:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Welcome to ZooKeeper!
JLine support is enabled
[zk: master.rh.bigdata.cluster:2181(CONNECTING) 0] get /zk_smoketest
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /zk_smoketest
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1184)
        at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:722)
        at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:596)
        at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:368)
        at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:328)
        at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:287)
Connecting to master.rh.bigdata.cluster:2181
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.



Master Mentor

@Adil BAKKOURI

The output of your ZooKeeper process command shows a log file name like the following:

root@RHBigData1:~# ps -ef | grep -i zookeeper
.
.
-Dzookeeper.log.file=zookeeper-zookeeper-server-RHBigData1.log


I suspect that your ZooKeeper hostname might not be correct.

Usually the ZooKeeper log file name is generated from the FQDN of the node. So if the FQDN of your ZooKeeper host is set correctly, it should show something like the following:

# hostname -f
master.rh.bigdata.cluster

So please verify that your ZooKeeper host has its FQDN set up correctly, and restart ZooKeeper after fixing the FQDN. Afterwards you should see a ZooKeeper log file name like "zookeeper-zookeeper-server-master.rh.bigdata.cluster.log".
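
A quick way to run that check on each node is sketched here. It assumes the expected names are the ones used in this thread; fqdn_matches is a made-up helper name. Names are compared case-insensitively because DNS names are not case-sensitive.

```shell
#!/usr/bin/env bash
# fqdn_matches EXPECTED [ACTUAL]: compare host names case-insensitively.
# ACTUAL defaults to the output of `hostname -f` on the current host.
fqdn_matches() {
    local expected="$1" actual="${2:-$(hostname -f)}"
    [ "$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')" = \
      "$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')" ]
}

# Example, on the ZooKeeper host:
#   hostname; hostname -f                      # system host name vs. FQDN
#   getent hosts master.rh.bigdata.cluster     # how the name resolves locally
#   fqdn_matches master.rh.bigdata.cluster && echo "FQDN ok"
```

Comparing the plain hostname output with hostname -f is worthwhile here, since the two can differ on the same machine.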

Hi @Jay Kumar SenSharma,
I just checked my FQDNs and they are all correct:

  • master.rh.bigdata.cluster
  • node2.rh.bigdata.cluster
  • node3.rh.bigdata.cluster
  • node4.rh.bigdata.cluster


I still get the error!