Created 08-10-2017 05:21 AM
The Ambari Metrics Collector keeps restarting. First it refuses connections, then alerts appear in Ambari, and finally its status goes back to OK. This cycle repeats again and again, with the collector stopping and auto-starting by itself. I have gone through the ambari-metrics-collector logs and see the following:
Error Path:/ams-hbase-unsecure Error:KeeperErrorCode = NodeExists for /ams-hbase-unsecure
Error:KeeperErrorCode = NodeExists for /ams-hbase-unsecure/online-snapshot/acquired
zookeeper.ClientCnxn: Opening socket connection to server Will not attempt to authenticate using SASL (unknown error)
Note : hbase.zookeeper.property.clientPort=2181
Created 08-10-2017 05:25 AM
1. Are you running AMS with proper heap settings, or with the default values? Insufficient heap can sometimes cause this. Please refer to the following doc to tune the AMS memory settings based on the number of nodes: https://cwiki.apache.org/confluence/display/AMBARI/Configurations+-+Tuning
2. If it still crashes, try performing the AMS data cleanup once and then see whether it comes up fine. https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data
3. Are you running AMS in Embedded Mode (default) or Distributed Mode (external HBase)?
Created 08-10-2017 06:39 AM
embedded mode
Created 08-10-2017 07:37 AM
First check the value of `zookeeper.znode.parent` in HBase and set it to the same value in Ambari.
Then kill all the metrics processes running on the node:
`ps -ef | grep metrics` and kill all of them, as they were caching the `/hbase` value.
Watch the Ambari Metrics Collector log (/var/log/ambari-metrics-collector/ambari-metrics-collector.log) while you do the steps below.
Steps:
0. tail -f /var/log/ambari-metrics-collector/ambari-metrics-collector.log
1. Stop Ambari Metrics (the Ambari server itself must stay up, because step 3 calls its REST API)
2. Kill all the metrics processes
3. curl --user admin:admin -i -H "X-Requested-By: ambari" -X DELETE http://`hostname -f`:8080/api/v1/clusters/CLUSTERNAME/services/AMBARI_METRICS
=> Make sure you replace CLUSTERNAME with your cluster name
4. Refresh Ambari UI
5. Add Service
6. Select Ambari Metrics
7. In the configuration screen, make sure to set the value of `zookeeper.znode.parent` to what is configured in the HBase service. By default in Ambari Metrics it is set to an empty value.
8. Deploy
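Steps 2 and 3 can be scripted roughly as follows. This is only a sketch: the function names `ams_service_url` and `list_metrics_pids` are made up for illustration, and `admin:admin`, port 8080 and `CLUSTERNAME` are the placeholders from the steps above. The destructive commands are left commented out so that you can review the PID list first.

```shell
#!/usr/bin/env bash
# Sketch of steps 2 and 3. Function names and defaults are illustrative only.

# Build the REST endpoint of the AMBARI_METRICS service.
# $1 = Ambari server host, $2 = cluster name, $3 = port (defaults to 8080 as in step 3).
ams_service_url() {
  local host="$1" cluster="$2" port="${3:-8080}"
  printf 'http://%s:%s/api/v1/clusters/%s/services/AMBARI_METRICS\n' \
    "$host" "$port" "$cluster"
}

# Print the PIDs of processes whose command line contains "metrics".
# The [m] bracket keeps grep from matching its own command line.
list_metrics_pids() {
  ps -ef | grep '[m]etrics' | awk '{print $2}'
}

# Review the output of list_metrics_pids before uncommenting anything below.
# list_metrics_pids | xargs -r kill        # -r (GNU xargs): do nothing on empty input
# curl --user admin:admin -i -H "X-Requested-By: ambari" -X DELETE \
#   "$(ams_service_url "$(hostname -f)" CLUSTERNAME)"
```

Keeping the URL in one function means the cluster name is substituted in a single place, which avoids sending the DELETE with the literal `CLUSTERNAME` still in it.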
In embedded mode, hbase.cluster.distributed should be false and hbase.rootdir should be set to a local directory using the "file://" scheme.
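To double-check those two properties on the collector host, something like the following may help. It is a minimal sketch that assumes the usual Hadoop-style `<property><name>…</name><value>…</value></property>` layout. The helper names `hbase_prop` and `check_embedded_mode` are hypothetical, and the path `/etc/ams-hbase/conf/hbase-site.xml` in the usage comment is an assumption about where AMS keeps its HBase config, so adjust it for your install.

```shell
#!/usr/bin/env bash
# Sketch: read two properties from an hbase-site.xml and compare them with
# the embedded-mode expectations described above.

# Print the <value> that follows <name>$2</name> in file $1.
hbase_prop() {
  local file="$1" name="$2"
  awk -v name="$name" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}

# Return 0 if the file looks like an embedded-mode configuration, 1 otherwise.
check_embedded_mode() {
  local file="$1" distributed rootdir
  distributed="$(hbase_prop "$file" hbase.cluster.distributed)"
  rootdir="$(hbase_prop "$file" hbase.rootdir)"
  # ${rootdir#file://} differs from $rootdir only when the prefix is present.
  if [ "$distributed" = "false" ] && [ "${rootdir#file://}" != "$rootdir" ]; then
    echo "embedded mode settings look consistent"
  else
    echo "mismatch: hbase.cluster.distributed=$distributed hbase.rootdir=$rootdir"
    return 1
  fi
}

# Usage (path is an assumption, see above):
# check_embedded_mode /etc/ams-hbase/conf/hbase-site.xml
```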