ambari metrics collector

The Ambari Metrics Collector stopped on our machine. When we try to restart it from Ambari, the restart fails, but when I check the processes on the machine, they are running.

I also get the following Ambari alert:

Metrics Collector - Auto-Restart Status

Metrics Collector has been auto-started 2 times since 2016-07-29 00:12:30.

I do see the following error in the logs

: 6:50:24,047 ERROR [main] ZooKeeperWatcher:652 - hconnection-0x5a7005d-0x156315434410005, quorum=localhost:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)
Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)

I even tried reinstalling the Metrics Collector, but it still does not work. Any thoughts on how to fix this?

I have already seen a few posts in the forum, but none of them helped.

1 ACCEPTED SOLUTION

Expert Contributor

Hi @ARUN,

Please clear the contents of /var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/* and restart Ambari metrics collector. Let me know if that works!

Thanks
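
For readers who prefer to script the cleanup, here is a minimal Python sketch of the step described above. It only empties the directory named in the reply and leaves the directory itself in place. The function name `clear_directory_contents` is hypothetical, and the sketch assumes the collector is stopped before it runs and that the script has write access to the directory; neither detail comes from the original reply.

```python
import shutil
from pathlib import Path

# Path taken from the reply above; adjust it if your hbase-tmp lives elsewhere.
AMS_ZK_DIR = Path("/var/lib/ambari-metrics-collector/hbase-tmp/zookeeper")


def clear_directory_contents(directory: Path) -> list[str]:
    """Delete everything inside `directory` but keep the directory itself.

    Returns the sorted names of the removed entries.
    """
    removed = []
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it and everything below it.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove just that entry.
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

Calling `clear_directory_contents(AMS_ZK_DIR)` and then restarting the Metrics Collector from Ambari mirrors the two steps in the reply.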

15 REPLIES

Super Guru
@ARUN

The Ambari Metrics Collector is built on HBase and Phoenix, and HBase uses ZooKeeper. Is your ZooKeeper running fine? What about the HMaster? Check the following page and its child pages for details.

https://cwiki.apache.org/confluence/display/AMBARI/Metrics
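
One quick way to answer the "is ZooKeeper running" question is to test whether anything accepts TCP connections on the quorum address shown in the log (localhost:61181). The Python sketch below does only that; the helper name `port_is_open` is hypothetical, and an open port shows that some process is listening, not that ZooKeeper is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False
```

For the setup in the question, `port_is_open("localhost", 61181)` returning False would be consistent with the ConnectionLoss error in the log.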

@mqureshi, we are using the ZooKeeper provided by AMS, and the HBase is also the AMS one (ams-hbase). The Metrics Collector was stopped abruptly. Could that have corrupted something? Do I need to check anything there?

Super Guru

@ARUN

I cannot say whether HBase is corrupted, but you can try running "hbck" from the bin directory of your HBase install (the one for AMS). If hbck finds any inconsistency, please follow the guidelines on this page to fix the issues.

http://hbase.apache.org/0.94/book/apbs03.html

If the details on the above page are not enough, please see Appendix C on this link.

HBase in AMS is a standalone HBase and there is no need for hbck.

Expert Contributor

Hi @Aravindan Vijayan

I've had the same issue, and your solution helped me. Thanks!

Expert Contributor

I've had the same issue and the solution helped, @Aravindan Vijayan.

Contributor

Hi @Aravindan Vijayan

Thanks for the info. It resolved the issue for me, but I ran into the same issue again after a couple of days. Is this only a temporary solution (a workaround)?

Thanks,

Tamil.

Expert Contributor

Hi @Aravindan Vijayan, I have the same issue. I deleted the files, but AMS still doesn't start.

Explorer

@Aravindan Vijayan,

Clearing the contents of /var/lib/ambari-metrics-collector/hbase-tmp/* and restarting the Ambari metrics collector totally fixed our problem on HDInsight Hadoop 3.6.1 (HDP 2.6.5).

It was a little confusing because HBase is not deployed on the Azure Hadoop 3.6.1 cluster. Evidently, Ambari Metrics is running its own internal HBase instance?

FYI, the error in /var/log/ambari-metrics-collector/ambari-metrics-collector.log was:

21:09:16,997  INFO [main-SendThread(xxxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1032 - Opening socket connection to server xxx.cx.internal.cloudapp.net/xx.xxx.xx.xx:61181. Will not attempt to authenticate using SASL (unknown error)
21:09:16,998  WARN [main-SendThread(xxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1162 - Session 0x16886b291270010 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)

Thanks
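
To spot this failure quickly in a long collector log, a small filter can pull out only the lines that mention a ZooKeeper connection problem. The Python sketch below is one way to do it; the function name `find_zk_connection_errors` is hypothetical, and the two markers it searches for are simply the strings seen in the excerpts quoted in this thread, so other failure modes would need additional patterns.

```python
import re
from typing import Iterable

# Markers taken from the log excerpts quoted in this thread.
ZK_ERROR_PATTERN = re.compile(r"ConnectionLoss|Connection refused")


def find_zk_connection_errors(lines: Iterable[str]) -> list[str]:
    """Return the lines that mention a ZooKeeper connection failure."""
    return [line.rstrip("\n") for line in lines if ZK_ERROR_PATTERN.search(line)]
```

Passing an open file handle for /var/log/ambari-metrics-collector/ambari-metrics-collector.log as `lines` returns the matching entries in file order.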

New Contributor

This solution works for me on HDP 3.1.4 with Ambari 2.7.

Thanks for sharing.

Expert Contributor

Thanks, it solved the problem. But how did you find the answer? I would never have guessed it myself.

Expert Contributor

@ARUN - did you get this fixed? @mqureshi, @Aravindan Vijayan - looping you in as well.

I'm getting the same error ->

https://community.hortonworks.com/questions/70820/kerberized-hdp-24-amabari-metric-v-2220-shutting-d...

----------------------------------

2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
2016-12-11 17:00:38,348 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x155aa3ef-0x158eed3ab0f0002, quorum=localhost:61181, baseZNode=/ams-hbase-secure Unable to set watcher on znode (/ams-hbase-secure)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)

Explorer

@Aravindan Vijayan - We have a 20-node Hadoop cluster. We installed the Base service completely in our cluster. The Ambari Metrics Collector runs in embedded mode. We are noticing that the collector starts and then stops again after some time. Checking the logs, this is the error:

2017-12-12 11:03:38,443 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x70cf32e3-0x1604b8318690004, quorum=hdpmprod000.corp.pgcore.com:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server

Since we do not have the HBase service in our cluster, can I comment out all the AMS HBase configuration to avoid the AMS HBase related issues? If not, what is the best way to handle this?

Any help is much appreciated.

Thanks,

Abhishek

Explorer

@Geoffrey Shelton Okot, @ARUN, @Aravindan Vijayan, @Abhishek Reddy Chamakura, @Karan Alang, did you get a permanent solution for this issue?

We are getting the error "KeeperErrorCode = NodeExists for /ams-hbase-secure/namespace/hbase".

We are facing the same issue.

email address: tauqeerkhan@outlook.com