Created on 01-23-2018 11:27 PM - edited 09-16-2022 05:46 AM
Hello, the disk on the host where the NameNode and AMS services run filled up. I fixed that, but now the AMS Collector doesn't start.
This is the AMS Collector's error message:
2018-01-23 10:29:01,077 INFO [main] zookeeper.ZooKeeper: Initiating client connection, connectString=hw.example.com:61181 sessionTimeout=120000 watcher=org.apache.hadoop.hbase.zookeeper.PendingWatcher@c540f5a
2018-01-23 10:29:01,095 INFO [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:01,114 WARN [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2018-01-23 10:29:02,222 INFO [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:02,222 WARN [main-SendThread(hw.example.com:61181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2018-01-23 10:29:02,324 WARN [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=hw.example.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
2018-01-23 10:29:02,324 ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper getData failed after 1 attempts
2018-01-23 10:29:02,324 WARN [main] zookeeper.ZKUtil: clean znode for master0x0, quorum=hw.example.com:61181, baseZNode=/ams-hbase-secure Unable to get data of znode /ams-hbase-secure/master
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
    at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
2018-01-23 10:29:02,325 ERROR [main] zookeeper.ZooKeeperWatcher: clean znode for master0x0, quorum=hw.example.com:61181, baseZNode=/ams-hbase-secure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
    at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
2018-01-23 10:29:02,325 WARN [main] zookeeper.ZooKeeperNodeTracker: Can't get or delete the master znode
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/master
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:354)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:714)
    at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:267)
    at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:149)
    at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2838)
2018-01-23 10:29:01,191 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:01,192 WARN org.apache.zookeeper.ClientCnxn: Session 0x16123a0fb540000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
2018-01-23 10:29:01,298 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=hw.example.com:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure/meta-region-server
2018-01-23 10:29:02,339 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server hw.example.com/10.1.0.12:61181. Will not attempt to authenticate using SASL (unknown error)
2018-01-23 10:29:02,340 WARN org.apache.zookeeper.ClientCnxn: Session 0x16123a0fb540000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
No services are listening on ports 6188 and 61181. I have set HBase's ZooKeeper tick time to "hbase.zookeeper.property.tickTime = 6000".
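For anyone who wants to reproduce the port check, here is a minimal Python sketch. It only tests whether a TCP connection to each port succeeds; the helper names is_port_open and check_ports are made up for illustration and are not part of Ambari or AMS.

```python
import socket
from typing import Dict, Iterable


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and name resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 2.0) -> Dict[int, bool]:
    """Map each port to whether something accepted a connection on it."""
    return {port: is_port_open(host, port, timeout) for port in ports}


# Example call, using the host and ports mentioned in this post:
# check_ports("hw.example.com", [6188, 61181])
```

A False result for 61181 would be consistent with the "Connection refused" errors in the logs, since that is the port in the ZooKeeper connectString.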
Thanks in advance.
Created 01-23-2018 11:27 PM
I solved it by following these instructions: https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data