
Hi, I am unable to install HDP via Ambari. The ZooKeeper server is not running, and it logs the following error in /var/log/zookeeper/zookeeper-root-server-zookeeper-1.out. I am running it in an AWS environment.

Explorer

/var/log/zookeeper/zookeeper-root-server-zookeeper-1.out:

2016-06-14 06:38:38,357 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /54.206.194.70:49731 (no session established for client)
2016-06-14 06:38:38,664 - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@383] - Cannot open channel to 2 at election address zookeeper-2/54.253.26.67:3888
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:795)
2016-06-14 06:38:38,665 - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@383] - Cannot open channel to 3 at election address zookeeper-3/54.66.23.197:3888
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:795)
2016-06-14 06:38:38,666 - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification time out: 60000
2016-06-14 06:38:40,476 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /54.206.194.70:49734
2016-06-14 06:38:40,477 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running

=======================================================================================

Config files: /usr/hdp/current/zookeeper-server/conf/zoo.cfg

ZOOKEEPER_PID_DIR=/home/zookeeper/zookeeper_server.pid
clientPort=2181
syncLimit=5
autopurge.purgeInterval=24
maxClientCnxns=1000
dataDir=/hadoop/zookeeper
initLimit=10
zookeeper.znode.parent=/hbase-unsecure
tickTime=2000
autopurge.snapRetainCount=30
server.1=zookeeper-1:2888:3888
server.2=zookeeper-2:2888:3888
server.3=zookeeper-3:2888:3888

======================================================================

However, when I check with 'netstat -tulpn', nothing is listening on port 2888 or 3888. Please help!

1 ACCEPTED SOLUTION


@Payel Datta, these exceptions indicate that this member of the ZooKeeper ensemble cannot connect to port 3888 on the other 2 members of the ensemble.

2016-06-14 06:38:38,664 - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@383] - Cannot open channel to 2 at election address zookeeper-2/54.253.26.67:3888
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:795)
2016-06-14 06:38:38,665 - WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@383] - Cannot open channel to 3 at election address zookeeper-3/54.66.23.197:3888
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectAll(QuorumCnxManager.java:404)
    at org.apache.zookeeper.server.quorum.FastLeaderElection.lookForLeader(FastLeaderElection.java:840)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:795)

Port 3888 is used for ZooKeeper's leader election protocol. Without a successful connection, the ensemble cannot successfully elect a leader. Note that the message occurs for the connection to both of the other hosts: zookeeper-2/54.253.26.67 and zookeeper-3/54.66.23.197.
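For reference, each server.N line in the zoo.cfg from the question has the form host:quorumPort:electionPort, so the second port (3888) is the election port being refused in these messages. Here is a minimal Python sketch, assuming that three-field form, that lists the election endpoints a given member has to reach. The helper names parse_ensemble and election_peers are made up for illustration and are not part of ZooKeeper.

```python
import re

# Subset of the zoo.cfg posted in the question.
ZOO_CFG = """\
clientPort=2181
server.1=zookeeper-1:2888:3888
server.2=zookeeper-2:2888:3888
server.3=zookeeper-3:2888:3888
"""

# Assumes the simple "server.N=host:quorumPort:electionPort" form only.
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)$")


def parse_ensemble(cfg_text):
    """Return {myid: (host, quorum_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            myid, host, quorum, election = match.groups()
            servers[int(myid)] = (host, int(quorum), int(election))
    return servers


def election_peers(cfg_text, my_id):
    """List (host, election_port) for every ensemble member except my_id."""
    ensemble = parse_ensemble(cfg_text)
    return [
        (host, election)
        for sid, (host, _quorum, election) in sorted(ensemble.items())
        if sid != my_id
    ]
```

For the member with myid=1, election_peers(ZOO_CFG, 1) yields zookeeper-2 and zookeeper-3 on port 3888, which matches the two "Cannot open channel" warnings.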

The following warning indicates that client connections were rejected because the ZooKeeper ensemble is not fully initialized. This is expected behavior if an ensemble cannot elect a leader and complete its initialization.

2016-06-14 06:38:40,477 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running 

I recommend reviewing ZooKeeper logs from all 3 nodes in the ensemble to try to find root cause. If netstat reports that there is nothing listening on port 3888 on all 3 nodes, then try looking earlier in the logs to see if there was possibly a bind error when ZooKeeper tried to use port 3888. If nothing is easily found in the logs, try restarting all 3 ZooKeeper processes to get a fresh run. That might make it easier to see what is happening when it tries to bind to port 3888.
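Alongside netstat, it can help to probe the ports from each node to every other node. Below is a small Python sketch of such a probe using only the standard socket module; the function name probe_port and the 3-second timeout are arbitrary choices for illustration. As a rough rule of thumb (not a guarantee), "refused" suggests the host was reached but nothing accepted the connection, while "timeout" more often suggests packets being dropped somewhere along the way, for example by a firewall rule.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try a TCP connect and classify the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Same condition as the "Connection refused" in the ZooKeeper log.
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # Covers DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"
```

For example, running probe_port("zookeeper-2", 3888) on zookeeper-1, and the equivalent on the other two nodes, would show which directions are failing.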


12 REPLIES

Explorer

Hi @Chris Nauroth, any help I can get on this?


@Payel Datta, have you tried using NetCat to test whether it can bind and listen on these ports, like I suggested in an earlier comment? For example, "nc -l 2888" or "nc -l 3888". If a similar failure happens with NetCat, then that would confirm that you need to investigate further for some kind of networking problem on the hosts, though I can't tell exactly what that networking problem would be from the information here.
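If NetCat is not installed on the hosts, roughly the same bind test can be done with a few lines of Python. This is only a sketch of the idea, not a replacement for "nc -l": the function name try_listen is made up, and it releases the port immediately instead of waiting for a connection.

```python
import errno
import socket


def try_listen(port, host="0.0.0.0"):
    """Attempt to bind and listen on (host, port); return (ok, detail)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return True, "bound and listening"
    except OSError as exc:
        # Typical names here are EADDRINUSE, EADDRNOTAVAIL, or EACCES.
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, f"{name}: {exc.strerror}"
    finally:
        sock.close()
```

Calling try_listen(3888) on each node while ZooKeeper is stopped, and also passing the address that the node's own hostname resolves to as host, would show whether the operating system refuses the bind and why.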

Super Collaborator

I ran into this exact issue and didn't see a resolution here, so I wanted to update the thread for anyone who comes looking in the future:

I am setting up HDF on an Azure IaaS cluster and had the same issue of ZooKeeper being unable to bind to the port. In my case I believe it was the cloud network configuration that was blocking communication. Switching to internal IPs for my VMs in /etc/hosts on all my nodes (rather than the public IPs I was using before) solved the issue.
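For anyone wanting to check this quickly, here is a rough Python sketch that reports whether each ensemble hostname resolves to a private or a public address on the node where it runs. The names is_private and audit_hosts are hypothetical helpers. The underlying assumption, which matches my experience but is not confirmed for the original poster's setup, is that cloud VMs usually have only the private address on their network interface, so hostnames mapped to public IPs can cause trouble.

```python
import ipaddress
import socket


def is_private(address):
    """True if the IP address string falls in a private (internal) range."""
    return ipaddress.ip_address(address).is_private


def audit_hosts(hostnames, resolve=socket.gethostbyname):
    """Map each hostname to (address, 'private' | 'public' | 'unresolved: ...')."""
    report = {}
    for name in hostnames:
        try:
            addr = resolve(name)
        except OSError as exc:
            # Resolution failures (socket.gaierror) are subclasses of OSError.
            report[name] = (None, f"unresolved: {exc}")
            continue
        report[name] = (addr, "private" if is_private(addr) else "public")
    return report
```

The addresses in the original log (54.253.26.67 and 54.66.23.197) would be reported as public. Running audit_hosts(["zookeeper-1", "zookeeper-2", "zookeeper-3"]) on each node shows what /etc/hosts or DNS currently returns there.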