Check Ambari Metrics + service check
Created on 04-10-2018 07:53 PM - edited 08-17-2019 08:15 PM
We ran a service check on Ambari Metrics and got the following error: "All metrics collectors are unavailable".
What can we do to solve this problem?
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 207, in <module>
    AMSServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 375, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 159, in service_check
    raise Fail("All metrics collectors are unavailable.")
resource_management.core.exceptions.Fail: All metrics collectors are unavailable.
Created 04-11-2018 09:01 AM
A few checks would be good to perform:
1. Verify whether Ambari is falsely showing the AMS services as running. On the AMS collector host(s), check that the AMS collector is actually running and listening on the correct address/port:
# netstat -tnlpa | grep 6188
# hostname -f
2. Verify that the AMS process PIDs match the PIDs listed in the following files. Mismatched PIDs sometimes cause false information in the UI:
$ ps -ef | grep ^ams | grep ApplicationHistoryServer
$ cat /var/run/ambari-metrics-collector/ambari-metrics-collector.pid
$ ps -ef | grep ^ams | grep HMaster
$ cat /var/run/ambari-metrics-collector/hbase-ams-master.pid
3. Can you try restarting the AMS collector service once to see if you notice any errors in the AMS collector logs?
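If it helps, checks 1 and 2 can also be scripted. The following is only a rough Python sketch, not part of Ambari: the function names (port_is_open, read_pid, pid_is_running, check_collector) are made up for illustration, and it assumes the collector port 6188 and the PID file paths quoted above.

```python
import os
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def read_pid(pid_file):
    """Return the integer PID stored in pid_file, or None if it cannot be read."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only probes for existence, it sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def check_collector(host, port, pid_files):
    """Hypothetical helper: combine the port check and the PID file checks."""
    report = {"port_open": port_is_open(host, port)}
    for pid_file in pid_files:
        pid = read_pid(pid_file)
        report[pid_file] = (pid, pid_is_running(pid))
    return report
```

On the collector host you could call check_collector with the collector's hostname, 6188, and the two .pid paths from step 2. A PID file that points at a process which is not running would be consistent with the stale status described above. Note that this only confirms a process with that PID exists, not that it is the AMS process, so the ps output is still worth comparing.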
Created 04-11-2018 09:04 AM
Also, please share the output of the following command from the AMS collector host and a few of the cluster nodes, to verify whether the AMS version is the same as the Ambari binary version:
# rpm -qa | grep ambari
Created 04-11-2018 09:11 AM
rpm -qa | grep ambari
ambari-metrics-monitor-2.6.1.0-143.x86_64
ambari-metrics-hadoop-sink-2.6.1.0-143.x86_64
ambari-agent-2.6.1.0-143.x86_64
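To compare this kind of output across several nodes, a small Python sketch like the one below could group packages by version. It is only an illustration: the regular expression assumes the usual name-version-release.arch layout of rpm output, and versions_by_package is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed layout: <name>-<version>-<release>.<arch>, where version starts with a digit.
RPM_PATTERN = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d[\w.]*-[\w.]+)\.(?P<arch>[^.]+)$"
)


def versions_by_package(rpm_lines):
    """Map each version-release string to the sorted package names that carry it."""
    grouped = defaultdict(list)
    for line in rpm_lines:
        match = RPM_PATTERN.match(line.strip())
        if match:  # silently skip lines that do not look like package names
            grouped[match.group("version")].append(match.group("name"))
    return {version: sorted(names) for version, names in grouped.items()}


# Package names copied from the output above.
packages = [
    "ambari-metrics-monitor-2.6.1.0-143.x86_64",
    "ambari-metrics-hadoop-sink-2.6.1.0-143.x86_64",
    "ambari-agent-2.6.1.0-143.x86_64",
]
result = versions_by_package(packages)
print(result)
```

If the resulting dictionary has more than one key, the Ambari packages on that host are not all at the same version.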
Created 04-11-2018 09:13 AM
About "restarting the AMS": when we restart all of the Metrics services, will that also restart the AMS?
Created 04-11-2018 10:05 AM
Created 04-11-2018 09:17 AM
@Jay after we set the correct date on all machines, the Ambari Metrics service check now passes. Do you think this is logical?
Created 04-11-2018 09:55 AM
Yes. The AMS service check basically posts a dummy metric to the collector, with start and end times as set in the script (both relative to the current time), so a time difference between hosts can be a valid reason for the service check failing.
get_metrics_parameters = {
    "metricNames": "AMBARI_METRICS.SmokeTest.FakeMetric",
    "appId": "amssmoketestfake",
    "hostname": params.hostname,
    "startTime": current_time - 60000,
    "endTime": current_time + 61000,
    "precision": "seconds",
    "grouped": "false",
}
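To illustrate why clock skew matters here, the following Python sketch models the window from the parameters above. It is a simplified model, not the actual service_check.py logic: query_window and metric_visible are hypothetical names, and it assumes the dummy metric's timestamp is simply compared against the requested start and end times (in milliseconds).

```python
def query_window(current_time_ms):
    """Return the (startTime, endTime) pair used in the parameters above."""
    return current_time_ms - 60000, current_time_ms + 61000


def metric_visible(metric_time_ms, checker_time_ms):
    """Return True if the metric timestamp falls inside the checker's window."""
    start, end = query_window(checker_time_ms)
    return start <= metric_time_ms <= end


checker_now = 1523440000000                # example clock on the checking host, in ms
in_sync = checker_now                      # metric stamped by a host with the same clock
five_min_behind = checker_now - 5 * 60 * 1000  # metric stamped by a clock 5 minutes behind

print(metric_visible(in_sync, checker_now))
print(metric_visible(five_min_behind, checker_now))
```

Under this model, a timestamp that is off by more than about a minute in either direction falls outside the window, which would fit the check passing again once the dates were corrected.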
And