Created on 07-25-2014 08:16 AM - edited 09-16-2022 02:03 AM
When I issue the command
sudo -u hdfs hdfs zkfc -formatZK
I get the error
14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:34 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Opening socket connection to server nn2/192.168.1.31:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Socket connection established to nn2/192.168.1.31:2181, initiating session
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Opening socket connection to server jt1/192.168.1.32:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Socket connection established to jt1/192.168.1.32:2181, initiating session
14/07/24 00:24:35 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server nn2/192.168.1.31:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to nn2/192.168.1.31:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Opening socket connection to server jt1/192.168.1.32:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Socket connection established to jt1/192.168.1.32:2181, initiating session
14/07/24 00:24:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Opening socket connection to server nn1/192.168.1.30:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Socket connection established to nn1/192.168.1.30:2181, initiating session
14/07/24 00:24:39 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
14/07/24 00:24:39 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
14/07/24 00:24:40 INFO zookeeper.ZooKeeper: Session: 0x0 closed
14/07/24 00:24:40 INFO zookeeper.ClientCnxn: EventThread shut down
14/07/24 00:24:40 FATAL ha.ZKFailoverController: Unable to start failover controller. Unable to connect to ZooKeeper quorum at nn1:2181,nn2:2181,jt1:2181. Please check the configured value for ha.zookeeper.quorum and ensure that ZooKeeper is running.
I have confirmed that the ZooKeeper service is running on every machine by running
[root@nn1 ~]# service zookeeper-server start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... already running as process 1065.
I can also connect with nc from every machine to every other machine
[root@nn1 ~]# nc nn1 2181
^C
[root@nn1 ~]# nc nn2 2181
^C
[root@nn1 ~]# nc jt1 2181
^C
[root@nn1 ~]#
I can see this in the zookeeper event log
2014-07-24 00:24:18,706 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:24:34,956 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35151
2014-07-24 00:24:34,956 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:34,956 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35151 (no session established for client)
2014-07-24 00:24:37,075 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35154
2014-07-24 00:24:37,076 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:37,076 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35154 (no session established for client)
2014-07-24 00:24:39,432 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35157
2014-07-24 00:24:39,433 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2014-07-24 00:24:39,433 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35157 (no session established for client)
2014-07-24 00:25:18,709 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:25:18,710 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:25:18,711 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:26:18,713 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:26:18,715 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:26:18,716 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
2014-07-24 00:26:40,619 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.1.30:35170
2014-07-24 00:26:43,508 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
    at java.lang.Thread.run(Thread.java:662)
2014-07-24 00:26:43,511 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.30:35170 (no session established for client)
2014-07-24 00:27:18,717 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (2, 1)
2014-07-24 00:27:18,719 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@190] - Have smaller server identifier, so dropping the connection: (3, 1)
2014-07-24 00:27:18,719 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 60000
Created 07-29-2014 04:50 AM
Looks like your ZooKeeper quorum was not able to elect a leader. Maybe you have misconfigured ZooKeeper?
Make sure that you have entered all 3 servers in your zoo.cfg, each with a unique ID. Make sure you have the same config on all 3 of your machines, and make sure that every server is using the correct myid as specified in the cfg.
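A minimal sketch of how that check could be scripted on each machine. It assumes zoo.cfg uses the usual server.N=host:port:port entries and that the myid file sits in the dataDir named in zoo.cfg (the /var/lib/zookeeper path in the usage comment is only an assumed location). check_myid is a hypothetical helper, not something shipped with ZooKeeper:

```shell
# check_myid: verify that the id stored in a myid file has a matching
# server.<id>=<host>:... entry in zoo.cfg.
# Usage: check_myid <zoo.cfg> <myid file> <hostname>
check_myid() {
    cfg=$1
    myid_file=$2
    host=$3
    # Strip whitespace and the trailing newline from the stored id.
    id=$(tr -d '[:space:]' < "$myid_file")
    if grep -q "^server\.${id}=${host}:" "$cfg"; then
        echo "OK: myid ${id} matches ${host}"
    else
        echo "MISMATCH: no server.${id} entry for ${host} in ${cfg}"
        return 1
    fi
}

# Example call on nn1 (paths are assumptions, adjust to your install):
#   check_myid /etc/zookeeper/conf/zoo.cfg /var/lib/zookeeper/myid nn1
```

If two machines carry the same id, at least one of them should report a mismatch, because only one server.N line can name each host.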
BR
Marc
Created 07-29-2014 04:35 PM
Thank you so much. Your answer is absolutely correct.
I went to each server and did
nn1: service zookeeper-server init --myid=1 --force
nn2: service zookeeper-server init --myid=2 --force
jt1: service zookeeper-server init --myid=3 --force
Earlier I had chosen an ID of 1 on every machine.
I also corrected my zoo.cfg to make sure it has the right entries.
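One way to confirm that the ensemble has actually formed after a fix like this is ZooKeeper's four-letter stat command. A server that is serving replies with a Mode: line (leader or follower), while one that has not joined a quorum replies with an error message instead. A plain nc connect only shows that the port is open. The sketch below assumes the four-letter commands are enabled on the servers; zk_mode is a hypothetical helper name:

```shell
# zk_mode: read the reply to ZooKeeper's "stat" command on stdin and
# print the server's mode, or "not serving" if no Mode: line is present.
zk_mode() {
    mode=$(sed -n 's/^Mode: //p')
    if [ -n "$mode" ]; then
        echo "$mode"
    else
        echo "not serving"
    fi
}

# Example use against the three hosts from this thread (not run here):
#   for h in nn1 nn2 jt1; do
#       printf '%s: ' "$h"
#       echo stat | nc "$h" 2181 | zk_mode
#   done
```

In a healthy three-node ensemble, one host should report leader and the other two follower.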
Now it works and I am able to do
sudo -u hdfs hdfs zkfc -formatZK
Thank you so much!