Created 07-29-2016 05:44 AM
The Ambari Metrics Collector stopped on our machine. When we try to restart it from Ambari, the restart fails, but when I check the processes on the machine, they are running.
I also get the following Ambari alert:
Metrics Collector - Auto-Restart Status
Metrics Collector has been auto-started 2 times since 2016-07-29 00:12:30.
I see the following error in the logs:
6:50:24,047 ERROR [main] ZooKeeperWatcher:652 - hconnection-0x5a7005d-0x156315434410005, quorum=localhost:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)
Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
I even tried reinstalling the Metrics Collector, but it still does not work. Any thoughts on how to fix this?
I have already seen a few posts in the forum, but none of them helped.
Created 08-01-2016 10:40 PM
Hi @ARUN,
Please clear the contents of /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/* and restart Ambari metrics collector. Let me know if that works!
Thanks
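For anyone who prefers to script the cleanup step, here is a minimal Python sketch. The function name clear_directory_contents is hypothetical and not part of Ambari; it only empties a directory while keeping the directory itself. Stopping the Metrics Collector before running it is my assumption, not something stated in this thread.

```python
import shutil
from pathlib import Path


def clear_directory_contents(directory: str) -> int:
    """Delete everything inside `directory` but keep the directory itself.

    Returns the number of top-level entries that were removed.
    """
    root = Path(directory)
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it together with its contents.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove only the entry itself.
            entry.unlink()
        removed += 1
    return removed
```

It would be called as clear_directory_contents("/var/lib/ambari-metrics-collector/hbase-tmp/zookeeper"), run as a user allowed to modify that directory, followed by a restart of the Metrics Collector from Ambari.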
Created 07-29-2016 06:08 AM
Ambari Metrics Collector is built on HBase and Phoenix, and HBase uses ZooKeeper. Is your ZooKeeper running fine? What about HMaster? Check the following pages and their child pages for details.
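One quick way to check whether anything is listening on the ZooKeeper address from the log (quorum=localhost:61181) is a plain TCP connect. This is only a sketch: the helper name can_connect is hypothetical, and a successful connect shows that the port accepts connections, not that ZooKeeper is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Calling can_connect("localhost", 61181) on the collector host and getting False would be consistent with the ConnectionLoss error in the question.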
Created 07-29-2016 06:16 AM
@mqureshi, we are using the ZooKeeper provided by AMS, and the HBase is also ams-hbase. The Metrics Collector was stopped abruptly. Could that have corrupted something? Do I need to check anything for that?
Created 07-29-2016 06:29 AM
I cannot say if HBase is corrupted but you can try running "hbck" from your hbase install bin directory (the one for ams). If hbck does find any inconsistency, please follow the guidelines on this page to fix the issues.
http://hbase.apache.org/0.94/book/apbs03.html
If the details on the above page are not enough, please see Appendix C on this link.
Created 08-02-2016 04:20 PM
HBase in AMS is a standalone HBase, and there is no need for hbck.
Created 10-26-2016 08:10 AM
I've had the same issue, and your solution helped me. Thanks!
Created 12-08-2016 09:38 AM
I've had the same issue, and the solution from @Aravindan Vijayan helped.
Created 05-24-2017 10:20 AM
Thanks for the info. It resolved the issue for me at first, but I ran into the same issue again after a couple of days. Is this only a temporary workaround?
Thanks,
Tamil.
Created 03-21-2018 06:27 AM
Hi @Aravindan Vijayan, I have the same issue. I deleted the files, but AMS still doesn't start.
Created 01-26-2019 03:49 AM
Clearing the contents of /var/lib/ambari-metrics-collector/hbase-tmp/* and restarting the Ambari metrics collector totally fixed our problem on HDInsight Hadoop 3.6.1 (HDP 2.6.5).
It was a little confusing because HBase is not deployed on the Azure Hadoop 3.6.1 cluster. Evidently, Ambari Metrics is running its own internal HBase instance?
FYI, the error in /var/log/ambari-metrics-collector/ambari-metrics-collector.log was:
21:09:16,997 INFO [main-SendThread(xxxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1032 - Opening socket connection to server xxx.cx.internal.cloudapp.net/xx.xxx.xx.xx:61181. Will not attempt to authenticate using SASL (unknown error)
21:09:16,998 WARN [main-SendThread(xxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1162 - Session 0x16886b291270010 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
Thanks
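To confirm that a collector log shows the same symptom as the excerpts in this thread, a small filter over the log lines may help. This is a sketch only: the function name and the two marker strings are my own choice, taken from the messages quoted here, and they are not an exhaustive list of ZooKeeper failure messages.

```python
# Substrings taken from the log excerpts quoted in this thread.
ERROR_MARKERS = ("ConnectionLoss", "Connection refused")


def find_zookeeper_connection_errors(lines):
    """Return the lines that contain one of the ERROR_MARKERS substrings."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in ERROR_MARKERS)
    ]
```

It could be used by opening /var/log/ambari-metrics-collector/ambari-metrics-collector.log and passing the file object as lines.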
Created 02-27-2022 11:00 PM
This solution works for me on HDP 3.1.4 with Ambari 2.7.
Thanks for sharing.
Created 07-31-2019 11:22 AM
Thanks, it solved the problem. But how did you find the answer to this? I would never have guessed it myself.
Created 12-11-2016 05:06 PM
@ARUN - did you get this fixed? @mqureshi, @Aravindan Vijayan - looping you in as well.
I'm getting the same error:
----------------------------------
2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
2016-12-11 17:00:38,348 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x155aa3ef-0x158eed3ab0f0002, quorum=localhost:61181, baseZNode=/ams-hbase-secure Unable to set watcher on znode (/ams-hbase-secure)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)
Created 12-22-2017 08:16 PM
@Aravindan Vijayan - We have a 20-node Hadoop cluster. We installed the Base service completely in our cluster. The Ambari Metrics Collector mode is embedded. We are noticing that the Ambari Metrics Collector starts and then stops after some time. Checking the logs, the error is below:
2017-12-12 11:03:38,443 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x70cf32e3-0x1604b8318690004, quorum=hdpmprod000.corp.pgcore.com:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server
Since we do not have the HBase service in our cluster, can I comment out all the AMS HBase configuration to avoid the AMS HBase related issues, or what is the best way to handle this?
Any help is much appreciated.
Thanks,
Abhishek
Created 06-29-2018 12:00 PM
@Geoffrey Shelton Okot, @ARUN, @Aravindan Vijayan, @Abhishek Reddy Chamakura, @Karan Alang Did you get a permanent solution for this issue?
We are facing the same issue. We are getting the error "KeeperErrorCode = NodeExists for /ams-hbase-secure/namespace/hbase".
email address: tauqeerkhan@outlook.com