Ambari Metrics Collector won't start

Explorer

On start, the logs are flooded with the following warning:

2017-01-04 21:01:11,912 WARN org.apache.zookeeper.ClientCnxn: Session 0x1596b43b51c0005 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-01-04 21:01:12,780 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)
2017-01-04 21:01:12,781 WARN org.apache.zookeeper.ClientCnxn: Session 0x1596b43b51c0001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
2017-01-04 21:01:13,237 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:61181. Will not attempt to authenticate using SASL (unknown error)

and eventually the following error:

2017-01-04 21:01:18,552 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: hconnection-0x77a98a6a-0x1596b43b51c0005, quorum=localhost:61181, baseZNode=/falcon-ams-hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /falcon-ams-hbase

ZK health seems OK from an Ambari perspective. Logging in via the ZK CLI, I don't see /falcon-ams-hbase:

[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, falcon-hbase]

Is this expected? Any idea on how to recover this?

3 REPLIES
Re: Ambari Metrics Collector won't start

Expert Contributor

The hbase.rootdir in the AMS configs should not be pointing to /falcon-ams-hbase.

Can you provide the following details?

1. Version of Ambari and AMS

2. Values for these properties:

- ams-site :: timeline.metrics.service.operation.mode

- ams-hbase-site :: hbase.zookeeper.quorum

- ams-hbase-site :: hbase.rootdir

3. What is the size of your cluster?

Re: Ambari Metrics Collector won't start

Explorer

Hi Swagle,

Sorry for the delay. Answers to your questions below:

1. Version of Ambari and AMS

Ambari: 2.2.1.0 AMS: 0.1.0

2. Values for these properties:

- ams-site :: timeline.metrics.service.operation.mode

<name>timeline.metrics.service.operation.mode</name> <value>embedded</value>

- ams-hbase-site :: hbase.zookeeper.quorum

<name>hbase.zookeeper.quorum</name> <value>localhost</value>

- ams-hbase-site :: hbase.rootdir

<name>hbase.rootdir</name> <value>file:///srv/hadoop/ambari/var/lib/ambari-metrics-collector/hbase</value>

3. What is the size of your cluster?

This is a small test cluster: 3 DataNodes/RegionServers with 2 cores and 2 GB of RAM each, and 1 NameNode with 8 cores and 26 GB of RAM.
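
For anyone who needs to gather the same property values, they can be read straight from the Hadoop-style XML config files instead of being copied by hand. The following is only a rough sketch: the helper name read_properties is made up for illustration, the sample XML is assembled from the values quoted above, and the location of the real ams-site and ams-hbase-site files on disk depends on the installation.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text, wanted):
    """Return {name: value} for the wanted property names.

    Hypothetical helper: expects the usual Hadoop layout of
    <configuration><property><name/><value/></property></configuration>.
    """
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name in wanted:
            found[name] = prop.findtext("value")
    return found


# Sample fragment assembled from the values quoted in this thread.
SAMPLE = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///srv/hadoop/ambari/var/lib/ambari-metrics-collector/hbase</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    wanted = {"hbase.zookeeper.quorum", "hbase.rootdir"}
    for name, value in sorted(read_properties(SAMPLE, wanted).items()):
        print(name, "=", value)
```

To use it on a real file, read the file contents into a string and pass that in place of SAMPLE.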

Re: Ambari Metrics Collector won't start

Expert Contributor

Any reason to use a year-old version of Ambari?

In embedded mode, AMS starts its own HBase in standalone mode and does not use the cluster's HBase. It is therefore expected that you could not find the AMS znode on the ZooKeeper listening on 2181. From the initial logs you posted, I see that the Collector is trying to connect to localhost/127.0.0.1:61181, which is expected. Can you check the AMS HBase logs, in the same location as the Collector log: /var/log/ambari-metrics-collector/hbase-ams-master-*.log? They should show the reason for the failure.
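
The two checks described above (is anything listening on 61181, and what does the AMS HBase master log complain about) could be scripted roughly as follows. This is a sketch under assumptions: port_is_open and find_problem_lines are hypothetical helper names, and the scan assumes the master log uses the same "date time LEVEL class: message" layout as the Collector log lines posted earlier in this thread.

```python
import glob
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unreachable hosts.
        return False


def find_problem_lines(lines, levels=("ERROR", "FATAL")):
    """Return the log lines whose third whitespace-separated field is in levels.

    Assumes the layout: date, time, LEVEL, rest of message.
    Stack-trace continuation lines are skipped because they do not match.
    """
    hits = []
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) >= 3 and parts[2] in levels:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # 61181 is the port the Collector was trying to reach in the posted logs.
    print("localhost:61181 reachable:", port_is_open("localhost", 61181))
    pattern = "/var/log/ambari-metrics-collector/hbase-ams-master-*.log"
    for path in glob.glob(pattern):
        with open(path, errors="replace") as fh:
            for hit in find_problem_lines(fh):
                print(path + ": " + hit)
```

If the port is not reachable, the embedded HBase (and its ZooKeeper) most likely never came up, and the ERROR or FATAL lines in the master log are the place to look for the cause.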

Another suggestion: in the AMS configs, try reverting any manual changes you might have made back to the defaults, then restart. The znode name itself does not matter in embedded mode, since it is a totally different ZooKeeper instance.

Useful links:

https://cwiki.apache.org/confluence/display/AMBARI/Troubleshooting+Guide

https://cwiki.apache.org/confluence/display/AMBARI/Known+Issues
