Created 09-18-2017 12:51 PM
I am trying to start ZKFC from Ambari, but it fails while executing the command: hdfs zkfc -formatZK -nonInteractive. All 3 ZooKeeper servers are in the running state according to the Ambari dashboard, and when I checked port 2181, it was open. However, when I tried to connect with telnet, I got:
telnet abctestlab0512.bdaas.com 2181
Trying x.x.x.x...
Connected to abctestlab0512.bdaas.com
Escape character is '^]'.
Connection closed by foreign host.
It seems like the connection is being closed by the server.
Check the following ZKFC logs:
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:java.library.path=:/usr/hdp/2.4.3.0-227/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.3.0-227/hadoop/lib/native
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-514.16.1.el7.x86_64
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:user.name=root
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Client environment:user.dir=/var/log/hadoop/hdp44-hdfs
17/09/18 07:36:34 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=abctestlab0512.bdaas.com:2181,abctestlab0513.bdaas.com:2181,abctestlab0515.bdaas.com:2181 sessionTimeout=5000 watcher=org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef@610f7aa
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0515.bdaas.com/192.120.10.24:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0515.bdaas.com/192.120.10.24:2181, initiating session
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0513.bdaas.com/192.120.10.22:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0513.bdaas.com/192.120.10.22:2181, initiating session
17/09/18 07:36:34 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:35 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0512.bdaas.com/192.120.10.21:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:35 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0512.bdaas.com/192.120.10.21:2181, initiating session
17/09/18 07:36:35 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:37 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0515.bdaas.com/192.120.10.24:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:37 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0515.bdaas.com/192.120.10.24:2181, initiating session
17/09/18 07:36:37 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0513.bdaas.com/192.120.10.22:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0513.bdaas.com/192.120.10.22:2181, initiating session
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Opening socket connection to server abctestlab0512.bdaas.com/192.120.10.21:2181. Will not attempt to authenticate using SASL (unknown error)
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Socket connection established to abctestlab0512.bdaas.com/192.120.10.21:2181, initiating session
17/09/18 07:36:38 INFO zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
17/09/18 07:36:39 ERROR ha.ActiveStandbyElector: Connection timed out: couldn't connect to ZooKeeper in 5000 milliseconds
17/09/18 07:36:39 INFO zookeeper.ZooKeeper: Session: 0x0 closed
17/09/18 07:36:39 FATAL ha.ZKFailoverController: Unable to start failover controller. Unable to connect to ZooKeeper quorum at abctestlab0512.bdaas.com:2181,abctestlab0513.bdaas.com:2181,abctestlab0515.bdaas.com:2181. Please check the configured value for ha.zookeeper.quorum and ensure that ZooKeeper is running.
17/09/18 07:36:39 INFO zookeeper.ClientCnxn: EventThread shut down
17/09/18 07:36:39 INFO tools.DFSZKFailoverController: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DFSZKFailoverController at abctestlab0512.bdaas.com/192.120.10.21
************************************************************/
Created 09-18-2017 06:57 PM
@Ajit Sonawane, can you check your ZooKeeper servers (abctestlab0512, abctestlab0513, abctestlab0515) and their logs to see whether they are running properly and are not overloaded?
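One quick way to check each server beyond the Ambari status is ZooKeeper's four-letter commands: a healthy server normally answers "ruok" with "imok". The following is only a rough sketch in Python, not an official tool. The function names four_letter_word and check_quorum are made up for illustration, and it assumes the four-letter commands are enabled on your ZooKeeper version and that port 2181 is the client port.

```python
import socket


def four_letter_word(host, port=2181, command="ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text.

    Hypothetical helper for illustration; an empty string means the server
    closed the connection without sending anything back.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def check_quorum(hosts, port=2181):
    """Return a dict mapping each host to its reply or to an error description."""
    results = {}
    for host in hosts:
        try:
            reply = four_letter_word(host, port)
            results[host] = reply or "<connection closed with no reply>"
        except OSError as exc:  # refused, timed out, DNS failure, ...
            results[host] = "<error: {}>".format(exc)
    return results
```

For example, printing check_quorum(["abctestlab0512.bdaas.com", "abctestlab0513.bdaas.com", "abctestlab0515.bdaas.com"]) should show "imok" for every healthy server. A server that accepts the connection but closes it without a reply, as in the telnet test, would show up as "<connection closed with no reply>", in which case the ZooKeeper server log on that host is the next place to look.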