zookeeper-client cannot connect to zookeeper server

Hi all,

In my HDP cluster, I installed three ZooKeeper servers and the ZooKeeper client on three nodes (master1, master2, master3).

All nodes run Red Hat 7.2.

When we run zookeeper-client from master1 against the ZooKeeper server on master1, we get CONNECTING.

When we run zookeeper-client from master1 against the ZooKeeper server on master2, we get CONNECTED.

When we run zookeeper-client from master1 against the ZooKeeper server on master3, we get CONNECTED.

Examples

[root@master1 ~]# /usr/hdp/current/zookeeper-client/bin/zookeeper-client -server master1:2181
Connecting to master1:2181
Welcome to ZooKeeper!
JLine support is enabled
[zk: master1:2181(CONNECTING) 0]        <-- we get CONNECTING instead of CONNECTED


[root@master1 ~]# /usr/hdp/current/zookeeper-client/bin/zookeeper-client -server master2:2181
Connecting to master2:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: master2:2181(CONNECTED) 0] 


[root@master1 ~]# /usr/hdp/current/zookeeper-client/bin/zookeeper-client -server master3:2181
Connecting to master3:2181
Welcome to ZooKeeper!
JLine support is enabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
[zk: master3:2181(CONNECTED) 0] 

So the problem is only on master1: the client cannot connect to the ZooKeeper server on that machine.


What could be the reason for that?


more /etc/zookeeper/2.6.4.0-91/0/zoo.cfg
clientPort=2181
syncLimit=15
autopurge.purgeInterval=24
maxClientCnxns=60
dataDir=/var/hadoop/zookeeper
initLimit=30
tickTime=2000
autopurge.snapRetainCount=30
server.1=master1.sys89.com:2888:3888
server.2=master2.sys89.com:2888:3888
server.3=master3.sys89.com:2888:3888

cat /usr/hdp/current/zookeeper-client/bin/zookeeper-client
#!/bin/bash


export ZOOKEEPER_HOME=/usr/hdp/2.6.4.0-91//zookeeper
export ZOOKEEPER_CONF=${ZOOKEEPER_HOME}/conf
export CLASSPATH=$CLASSPATH:$ZOOKEEPER_CONF:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*
export ZOOCFGDIR=${ZOOCFGDIR:-$ZOOKEEPER_CONF}
env CLASSPATH=$CLASSPATH ${ZOOKEEPER_HOME}/bin/zkCli.sh "$@"


We checked port 2181 and it accepts connections:

telnet localhost 2181
Trying ::1
Connected to localhost.
Escape character is '^]'.
Michael-Bronson
15 REPLIES

Expert Contributor

@Michael Bronson

Can you please check the ZooKeeper logs (/var/log/zookeeper) on master1.sys89.com? This can happen if there are too many open connections. Check whether there are any warning messages starting with “Too many connections from {IP address of master1.sys89.com}”. You can also verify this with netstat:

netstat -no | grep :2181 | wc -l

To fix this issue, clear all stale connections manually or try increasing the maxClientCnxns setting in /etc/zookeeper/2.6.4.0-91/0/zoo.cfg. From your zoo.cfg I can see maxClientCnxns=60, which is the default. You can increase it by setting maxClientCnxns=4096 and restarting the affected services.
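A plain line count hides which peer and which TCP state account for the connections. Here is a minimal sketch that groups them; the helper name summarize_zk_conns is made up for illustration, and it assumes the usual Linux `netstat -ant` column layout (local address in column 4, remote address in column 5, state in column 6).

```shell
# Summarize connections on the ZooKeeper client port by TCP state and remote address.
# Reads `netstat -ant` style lines on stdin; the port argument defaults to 2181.
# (summarize_zk_conns is a hypothetical helper, not part of ZooKeeper or HDP.)
summarize_zk_conns() {
  port="${1:-2181}"
  awk -v port="$port" '
    # keep only TCP lines whose local address ends in ":<port>"
    $1 ~ /^tcp/ && $4 ~ (":" port "$") {
      remote = $5
      sub(/:[0-9]+$/, "", remote)      # strip the remote port number
      count[$6 " " remote]++           # key: "<state> <remote address>"
    }
    END { for (k in count) print count[k], k }
  ' | sort -k1,1nr -k2
}

# Usage on the affected node:
#   netstat -ant | summarize_zk_conns 2181
```

Each output line is a count followed by the state and the remote address. CLOSE_WAIT means the peer has already closed and the local process has not yet closed its side, so a large CLOSE_WAIT count from a single address would point at that pairing rather than at the limit itself.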

We already did all these steps. If we increase maxClientCnxns to 500 or to 5000, then after some time all the ports are in use and we get CLOSE_WAIT, so increasing it isn't a solution. And from the log we don't see any hint about why the ZooKeeper client does not connect to the ZooKeeper server.

Michael-Bronson

By the way, if we restart ZooKeeper, the number of open connections is below maxClientCnxns and we still get CONNECTING. This means the ZooKeeper client can't connect to the ZooKeeper server.

Michael-Bronson
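Since a plain TCP connect (telnet) succeeds while the client stays in CONNECTING, one further check is whether the server process itself answers on that port. A minimal sketch using ZooKeeper's "ruok" four-letter command follows. It assumes nc (netcat) is installed on master1 and that four-letter commands are enabled on this ZooKeeper version; classify_ruok is a made-up helper name.

```shell
# Interpret the reply to ZooKeeper's "ruok" four-letter command, read from stdin.
# (classify_ruok is a hypothetical helper for illustration only.)
classify_ruok() {
  reply="$(cat)"
  case "$reply" in
    imok) echo "server answered imok" ;;
    "")   echo "no reply (connection closed or server not answering)" ;;
    *)    echo "unexpected reply: $reply" ;;
  esac
}

# Usage, assuming nc is available on the node:
#   echo ruok | nc master1 2181 | classify_ruok
```

An empty reply on master1 alongside "imok" from master2 and master3 would suggest that the port is open but the server on master1 drops the connection before answering.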

Mentor

@Michael Bronson

Isn't this a duplicate thread for the same problem?

https://community.hortonworks.com/questions/232197/too-many-connections-on-zookeper-server.html?chil...

As the problem is only on master1, can you check that your /etc/hosts entries are correct and that the firewall is not active on master1? Your zoo.cfg looks correct.
The last resort would be to delete and reinstall the ZooKeeper service.
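The /etc/hosts check can be scripted. Below is a minimal sketch that lists which of the server names from zoo.cfg have no entry in a hosts-format file. The helper name missing_from_hosts is made up for illustration. A name that is missing here is not necessarily a fault, since DNS may still resolve it.

```shell
# Print each given name that has no entry in a hosts-format file.
# $1 = path to the hosts file, remaining arguments = host names to look for.
# (missing_from_hosts is a hypothetical helper, not a standard tool.)
missing_from_hosts() {
  hosts_file="$1"; shift
  for name in "$@"; do
    awk -v name="$name" '
      { sub(/#.*/, "") }                                   # drop comments
      { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }  # field 1 is the IP
      END { exit found ? 0 : 1 }
    ' "$hosts_file" || echo "$name"
  done
}

# Usage with the names from zoo.cfg:
#   missing_from_hosts /etc/hosts master1.sys89.com master2.sys89.com master3.sys89.com
```

For any name this prints, `getent hosts <name>` on the same node shows what the system resolver actually returns for it.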


@Geoffrey Shelton Okot, yes, this is the same problem, but a different aspect of it. We checked /etc/hosts and the file is OK, and the firewall is disabled.

Anyway, we can use the IP instead of localhost, as follows:

/usr/hdp/current/zookeeper-client/bin/zookeeper-client -server SOME_IP:2181
Michael-Bronson

Mentor

@Michael Bronson
So if you use the IP, does it work? If so, please check the DNS resolution for this particular host.


No, even when we use the IP it does not work.

Michael-Bronson

Mentor

@Michael Bronson
Can you share the hosts entries on the three ZooKeeper nodes?

[root@master1 hdp]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

[root@master2 ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6


[root@master3 ~]# more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Michael-Bronson