ambari metrics collector

The Ambari Metrics Collector stopped on our machine. When we try to restart it from Ambari, the restart fails, but when I check the processes on the machine, they are running.

I also get the following Ambari alert:

Metrics Collector - Auto-Restart Status

Metrics Collector has been auto-started 2 times since 2016-07-29 00:12:30.

I see the following error in the logs:

6:50:24,047 ERROR [main] ZooKeeperWatcher:652 - hconnection-0x5a7005d-0x156315434410005, quorum=localhost:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)
Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)

I even tried reinstalling the Metrics Collector, but it still does not work. Any thoughts on how to fix this?

I have already seen a few posts on the forum, but none of them helped.

Explorer

@Aravindan Vijayan,

Clearing the contents of /var/lib/ambari-metrics-collector/hbase-tmp/* and restarting the Ambari metrics collector totally fixed our problem on HDInsight Hadoop 3.6.1 (HDP 2.6.5).

It was a little confusing because HBase is not deployed on the Azure Hadoop 3.6.1 cluster. Evidently, Ambari Metrics is running its own internal HBase instance?

FYI, the error in /var/log/ambari-metrics-collector/ambari-metrics-collector.log was:
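For anyone who wants to script the cleanup described above, here is a minimal Python sketch. It is only an illustration, not an official Ambari procedure: the helper name clear_directory_contents is made up for this example, and it assumes the Metrics Collector has already been stopped and that the script runs as a user allowed to delete those files. It removes everything inside the directory but keeps the directory itself.

```python
import shutil
from pathlib import Path


def clear_directory_contents(directory):
    """Delete every entry inside `directory`, keeping the directory itself.

    Hypothetical helper for illustration; returns the sorted names removed.
    """
    root = Path(directory)
    removed = []
    for child in root.iterdir():
        if child.is_dir() and not child.is_symlink():
            # Real subdirectory: remove it recursively.
            shutil.rmtree(child)
        else:
            # Regular file or symlink: remove just that entry.
            child.unlink()
        removed.append(child.name)
    return sorted(removed)


# Example call (path taken from the post above; stop the collector first):
# clear_directory_contents("/var/lib/ambari-metrics-collector/hbase-tmp")
```

After the directory is emptied, restart the Metrics Collector from Ambari as described in the post.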

21:09:16,997  INFO [main-SendThread(xxxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1032 - Opening socket connection to server xxx.cx.internal.cloudapp.net/xx.xxx.xx.xx:61181. Will not attempt to authenticate using SASL (unknown error)
21:09:16,998  WARN [main-SendThread(xxx.cx.internal.cloudapp.net:61181)] ClientCnxn:1162 - Session 0x16886b291270010 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
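The "Connection refused" in this trace means nothing was accepting connections on port 61181 at that moment. A quick way to confirm that from the collector host is a plain TCP connect test. The sketch below is only an illustration; port_is_open is a hypothetical helper name, and the host and port in the example call are taken from the logs in this thread.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Example call (values from the logs above):
# port_is_open("localhost", 61181)
```

If this returns False while the collector process appears to be running, the ZooKeeper endpoint the collector expects on that port is not up, which matches the ConnectionLoss errors reported here.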

Thanks

Expert Contributor

Thanks, it solved the problem, but how did you find the answer? I would never have guessed it myself.

Expert Contributor

@ARUN - did you get this fixed? @mqureshi, @Aravindan Vijayan - looping you in as well.

I'm getting the same error ->

https://community.hortonworks.com/questions/70820/kerberized-hdp-24-amabari-metric-v-2220-shutting-d...

----------------------------------

2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=localhost:61181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
2016-12-11 17:00:38,348 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2016-12-11 17:00:38,348 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: hconnection-0x155aa3ef-0x158eed3ab0f0002, quorum=localhost:61181, baseZNode=/ams-hbase-secure Unable to set watcher on znode (/ams-hbase-secure)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-secure
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:221)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:417)

Explorer

@Aravindan Vijayan - We have a 20-node Hadoop cluster. We installed the Base service completely in our cluster. The Ambari Metrics Collector runs in embedded mode. We are noticing that the collector starts and then stops again a short time later. The logs show the error below:

2017-12-12 11:03:38,443 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x70cf32e3-0x1604b8318690004, quorum=hdpmprod000.corp.pgcore.com:61181, baseZNode=/ams-hbase-unsecure Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ams-hbase-unsecure/meta-region-server

Since we do not have the HBase service in our cluster, can I comment out all of the AMS HBase configuration to avoid these AMS HBase issues? If not, what is the best way to handle it?

Any help is much appreciated.

Thanks,

Abhishek

Explorer

@Geoffrey Shelton Okot, @ARUN, @Aravindan Vijayan, @Abhishek Reddy Chamakura, @Karan Alang - did you get a permanent solution for this issue?

We are facing the same issue and are getting the error "KeeperErrorCode = NodeExists for /ams-hbase-secure/namespace/hbase".

email address: tauqeerkhan@outlook.com