The Ambari Metrics Collector stopped on our machine. When we try to restart it from Ambari, the restart fails, but when I check the processes on the machine, they are running.
I also get the following Ambari alert:
Metrics Collector - Auto-Restart Status
Metrics Collector has been auto-started 2 times since 2016-07-29 00:12:30.
I do see the following error in the logs:
6:50:24,047 ERROR [main] ZooKeeperWatcher:652 - hconnection-0x5a7005d-0x156315434410005, quorum=localhost:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)
Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
I even tried reinstalling the Metrics Collector, but it still does not work. Any thoughts on how to fix this?
I have already seen a few posts in the forum, but none of them helped.
The Ambari Metrics Collector is built on HBase and Phoenix, and HBase uses ZooKeeper. Is your ZooKeeper running fine? What about the HMaster? Check the following pages and their child pages for details.
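As a quick first check of the ZooKeeper question, here is a minimal sketch that only tests whether anything accepts TCP connections on the quorum address from the log (localhost:61181). The function name port_is_open is made up for this example, and a successful connection does not prove ZooKeeper is healthy, only that the port is listening.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Host and port are taken from "quorum=localhost:61181" in the log above.
    host, port = "localhost", 61181
    state = "accepting connections" if port_is_open(host, port) else "NOT reachable"
    print(f"ZooKeeper at {host}:{port} is {state}")
```

If the port is not reachable while the collector process is still listed as running, that would be consistent with the ConnectionLoss error in the log.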
I cannot say whether HBase is corrupted, but you can try running "hbck" from the bin directory of your HBase install (the one for AMS). If hbck finds any inconsistencies, please follow the guidelines on this page to fix them.
If the details on the above page are not enough, please see Appendix C at this link.
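If you save the hbck output mentioned above to a file, a small script can pull out the summary so you can compare runs over time. This is only a sketch: summarize_hbck is a hypothetical helper, and it assumes the output ends with a line like "N inconsistencies detected." followed by a "Status: ..." line, which may differ between HBase versions.

```python
import re


def summarize_hbck(output: str) -> dict:
    """Extract the inconsistency count and final status from hbck text output.

    Assumes summary lines of the form "N inconsistencies detected." and
    "Status: OK" (or another status word); a missing line yields None.
    """
    count = re.search(r"(\d+) inconsistencies detected", output)
    status = re.search(r"^Status:\s*(\w+)", output, re.MULTILINE)
    return {
        "inconsistencies": int(count.group(1)) if count else None,
        "status": status.group(1) if status else None,
    }


if __name__ == "__main__":
    # Illustrative text only, not output captured from a real cluster.
    sample = "2 inconsistencies detected.\nStatus: INCONSISTENT\n"
    print(summarize_hbck(sample))
```

A status other than OK, or a non-zero count, is the signal to go through the repair guidelines referenced above.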
Thanks for the info. It actually resolved the issue for me, but I ran into the same issue again after a couple of days. Is this only a temporary solution (a workaround)?