Ambari Metrics Collector restarting again and again

The Ambari Metrics Collector keeps restarting. First it refuses connections, then alerts appear in Ambari, and finally its status returns to OK. This cycle repeats: the collector stops and auto-starts by itself. I have gone through the ambari-metrics-collector logs and found the following:

Error Path:/ams-hbase-unsecure Error:KeeperErrorCode = NodeExists for /ams-hbase-unsecure

Error:KeeperErrorCode = NodeExists for /ams-hbase-unsecure/online-snapshot/acquired

zookeeper.ClientCnxn: Opening socket connection to server Will not attempt to authenticate using SASL (unknown error)

Note : hbase.zookeeper.property.clientPort=2181
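
To see how often these messages recur across restart cycles, the log can be scanned for them. This is a minimal sketch, not an official diagnostic: `count_matches` is a hypothetical helper name, and the default log path is the one quoted later in this thread, which may differ on other installations.

```shell
# Count the lines in a log file that contain a fixed string.
# `|| true` keeps the exit status at 0 when nothing matches (grep still prints 0).
count_matches() {
  # $1 = log file, $2 = fixed string to look for
  grep -cF -- "$2" "$1" || true
}

# Default collector log location (override with LOG=... if yours differs).
LOG="${LOG:-/var/log/ambari-metrics-collector/ambari-metrics-collector.log}"

if [ -r "$LOG" ]; then
  echo "NodeExists errors: $(count_matches "$LOG" 'KeeperErrorCode = NodeExists')"
  echo "SASL notices:      $(count_matches "$LOG" 'Will not attempt to authenticate using SASL')"
fi
```

A count that grows with every restart suggests the messages are tied to the restart loop rather than being one-off startup noise.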

1 ACCEPTED SOLUTION

Master Mentor

@Anurag Mishra

1. Are you running AMS with tuned heap settings, or with the default values? An insufficient heap can cause this kind of restart loop. The following doc explains how to tune the AMS memory settings based on the number of nodes: https://cwiki.apache.org/confluence/display/AMBARI/Configurations+-+Tuning

2. If it still crashes, try performing the AMS data cleanup and then check whether it comes up fine: https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data

3. Are you running AMS in Embedded Mode (the default) or Distributed Mode (external HBase)?

3 REPLIES

embedded mode

Master Mentor

@Anurag Mishra

First check the value of `zookeeper.znode.parent` in HBase and set it to the same value in Ambari.

Then kill all the metrics processes running on the node: list them with `ps -ef | grep metrics` and kill all of them, as they were caching the `/hbase` value.
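
One wrinkle with `ps -ef | grep metrics` is that the `grep` process itself shows up in the output. A minimal sketch of a filter that avoids this and prints only PIDs follows; `metrics_pids` is a hypothetical helper name, and matching on the bare word "metrics" is as broad as the original command, so review the list before killing anything.

```shell
# Read `ps -ef` output on stdin and print the PID column (2nd field) of
# lines mentioning "metrics", skipping the grep/awk helpers themselves.
metrics_pids() {
  awk '/metrics/ && !/awk/ && !/grep/ { print $2 }'
}

# List candidate PIDs on this node (skipped if ps is not available).
if command -v ps >/dev/null 2>&1; then
  ps -ef | metrics_pids
fi

# After reviewing the list, the PIDs could be passed to kill, for example:
#   ps -ef | metrics_pids | xargs kill
```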

Watch the Ambari Metrics Collector log (`/var/log/ambari-metrics-collector/ambari-metrics-collector.log`) while you perform the steps below.

Steps:

0. `tail -f /var/log/ambari-metrics-collector/ambari-metrics-collector.log`

1. Stop Ambari Metrics (the Ambari server itself has to stay up, since step 3 calls its REST API on port 8080)

2. Kill all the metrics processes

3. curl --user admin:admin -i -H "X-Requested-By: ambari" -X DELETE http://`hostname -f`:8080/api/v1/clusters/CLUSTERNAME/services/AMBARI_METRICS

=> Make sure you replace CLUSTERNAME with your cluster name

4. Refresh Ambari UI

5. Add Service

6. Select Ambari Metrics

7. In the configuration screen, make sure to set `zookeeper.znode.parent` to the value configured in the HBase service. By default it is empty in Ambari Metrics.

8. Deploy
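
The DELETE call in step 3 is easy to mistype, so it can help to build the URL from variables and print the command before running it. This is only a sketch under assumptions: `ams_service_url`, `AMBARI_HOST` and `CLUSTER` are hypothetical names, and the `admin:admin` credentials and port 8080 are simply copied from step 3.

```shell
# Build the REST URL used in step 3. Port and path come from that command;
# host and cluster name are supplied by the caller.
ams_service_url() {
  # $1 = Ambari server host, $2 = cluster name
  printf 'http://%s:8080/api/v1/clusters/%s/services/AMBARI_METRICS\n' "$1" "$2"
}

# Fall back to localhost if `hostname -f` is unavailable.
AMBARI_HOST="${AMBARI_HOST:-$(hostname -f 2>/dev/null || echo localhost)}"
CLUSTER="${CLUSTER:-CLUSTERNAME}"   # placeholder, replace with your cluster name

# Dry run: print the command instead of executing it.
echo curl --user admin:admin -i -H "'X-Requested-By: ambari'" -X DELETE \
  "$(ams_service_url "$AMBARI_HOST" "$CLUSTER")"
```

Once the printed command looks right, copy it and run it by hand.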

In embedded mode, `hbase.cluster.distributed` should be false and `hbase.rootdir` should point to a local directory using the `file://` scheme.
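
To confirm what the collector is actually configured with, the properties can be read straight from the AMS HBase config file. This is a rough sketch: `get_prop` is a hypothetical helper, the file location `/etc/ams-hbase/conf/hbase-site.xml` is an assumption that may not match your installation, and the parsing expects the usual layout with `<name>` and `<value>` on the same or adjacent lines.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style *-site.xml.
# Prints nothing if the property is absent.
get_prop() {
  # $1 = config file, $2 = property name
  grep -F -A 2 "<name>$2</name>" "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p' \
    | head -n 1
}

# Assumed location of the AMS HBase config (override with CONF=...).
CONF="${CONF:-/etc/ams-hbase/conf/hbase-site.xml}"

if [ -r "$CONF" ]; then
  for p in hbase.cluster.distributed hbase.rootdir zookeeper.znode.parent; do
    echo "$p = $(get_prop "$CONF" "$p")"
  done
fi
```

For embedded mode, the output should show `hbase.cluster.distributed = false` and an `hbase.rootdir` beginning with `file://`.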