Ambari Metrics Service Check Failed
Labels: Apache Ambari
Created ‎12-08-2016 12:25 PM
2016-12-08 19:30:35,118 - Using hadoop conf dir: /usr/isdp/current/hadoop-client/conf
2016-12-08 19:30:35,120 - checked_call['hostid'] {}
2016-12-08 19:30:35,184 - checked_call returned (0, 'a8c0366f')
2016-12-08 19:30:35,184 - Ambari Metrics service check was started.
2016-12-08 19:30:35,188 - Generated metrics:
{ "metrics": [ { "metricname": "AMBARI_METRICS.SmokeTest.FakeMetric", "appid": "amssmoketestfake", "hostname": "bigdata002.istuary.com", "timestamp": 1481196635000, "starttime": 1481196635000, "metrics": { "1481196635000": 0.380131946063, "1481196636000": 1481196635000 } } ] }
2016-12-08 19:30:35,188 - Connecting (POST) to bigdata002.istuary.com:6188/ws/v1/timeline/metrics/
2016-12-08 19:30:50,232 - Connection failed. Next retry in 15 seconds.
2016-12-08 19:30:50,234 - Generated metrics:
{ "metrics": [ { "metricname": "AMBARI_METRICS.SmokeTest.FakeMetric", "appid": "amssmoketestfake", "hostname": "bigdata002.istuary.com", "timestamp": 1481196650000, "starttime": 1481196650000, "metrics": { "1481196650000": 0.380131946063, "1481196651000": 1481196650000 } } ] }
2016-12-08 19:30:50,234 - Connecting (POST) to bigdata002.istuary.com:6188/ws/v1/timeline/metrics/
2016-12-08 19:31:05,250 - Connection failed. Next retry in 15 seconds.
2016-12-08 19:31:05,251 - Generated metrics:
{ "metrics": [ { "metricname": "AMBARI_METRICS.SmokeTest.FakeMetric", "appid": "amssmoketestfake", "hostname": "bigdata002.istuary.com", "timestamp": 1481196665000, "starttime": 1481196665000, "metrics": { "1481196665000": 0.380131946063, "1481196666000": 1481196665000 } } ] }
2016-12-08 19:31:05,251 - Connecting (POST) to bigdata002.istuary.com:6188/ws/v1/timeline/metrics/
2016-12-08 19:31:20,266 - Connection failed. Next retry in 15 seconds.

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 184, in <module>
    AMSServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 102, in service_check
    raise Fail("Metrics were not saved. Service check has failed. "
resource_management.core.exceptions.Fail: Metrics were not saved. Service check has failed. Connection failed.
Created ‎12-10-2016 03:00 AM
Thank you all. I have fixed the bug in my program. I had customized my stack but had not changed the stack_advisor.py that corresponds to that stack.
Created ‎12-08-2016 12:26 PM
My Ambari version is 2.4.1, but I cannot find what is going wrong.
Created ‎12-08-2016 12:29 PM
What is the output of the following command? Does the package show the same version on all hosts?
rpm -qa | grep ambari-metrics
Created ‎12-08-2016 12:45 PM
As per the code: https://github.com/apache/ambari/blob/release-2.4.1/ambari-server/src/main/resources/common-services...
Ambari retries a number of times (AMS_CONNECT_TRIES=30) before showing that error. Can you please check that there is no connectivity issue and that you can access the following endpoint:
Connecting (POST) to bigdata002.istuary.com:6188/ws/v1/timeline/metrics/
The service check script runs on one of the healthy nodes in the cluster (which might not always be the Ambari server). From the host where service_check last ran, can you open the URL mentioned above? "bigdata002.istuary.com:6188" should be accessible from that host.
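One way to run the connectivity check suggested above is a small TCP probe executed on the host where the service check last ran. This is only a minimal sketch and not part of Ambari: the helper name `can_connect` is hypothetical, and it only tests that the port accepts a TCP connection, not that the collector answers the POST correctly.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it does not send any metrics.
    """
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, name resolution failures, and timeouts.
        return False
    sock.close()
    return True
```

For the cluster in this thread the call would be `can_connect("bigdata002.istuary.com", 6188)`. A `False` result points to the collector not listening on that address or to a network or firewall problem between the two hosts.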
Created ‎12-08-2016 01:04 PM
Yes, you are right, but I deployed Ambari on a single host, so I cannot see why there would be anything wrong with the network.
Created ‎12-08-2016 01:13 PM
Is the following port open on that host? Do you see any errors in the AMS logs?
netstat -tnlpa | grep 6188
Created ‎12-08-2016 01:17 PM
Port 6188 is OK. The AMS logs have some errors, such as these in /var/log/ambari-metrics-monitor/ambari-metrics-monitor.out:
2016-12-08 19:50:34,611 [INFO] controller.py:56 - Running Controller thread: Thread-1
2016-12-08 19:50:34,611 [INFO] emitter.py:55 - Running Emitter thread: Thread-2
2016-12-08 19:50:34,611 [INFO] emitter.py:75 - Nothing to emit, resume waiting.
2016-12-08 19:51:34,614 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:51:34,614 [WARNING] emitter.py:90 - Retrying after 5 ...
2016-12-08 19:51:39,614 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:51:39,614 [WARNING] emitter.py:90 - Retrying after 5 ...
2016-12-08 19:51:44,615 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:51:44,615 [WARNING] emitter.py:90 - Retrying after 5 ...
2016-12-08 19:52:49,616 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:52:49,616 [WARNING] emitter.py:90 - Retrying after 5 ...
2016-12-08 19:52:54,616 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:52:54,617 [WARNING] emitter.py:90 - Retrying after 5 ...
2016-12-08 19:52:59,618 [WARNING] emitter.py:84 - Error sending metrics to server. [Errno 111] Connection refused
2016-12-08 19:52:59,618 [WARNING] emitter.py:90 - Retrying after 5 ...
The other logs are OK.
Created ‎12-08-2016 01:18 PM
ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: RECEIVED SIGNAL 15: SIGTERM
The error above appears in ambari-metrics-collector.log.
Created ‎12-08-2016 02:03 PM
It looks like your AMS is going down. Can you check whether this happens frequently? We have sometimes seen that when it has insufficient memory, it is killed and then restarted.
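To see how often the collector is being terminated, the SIGTERM lines in the collector log can be listed with their timestamps. The following is a rough sketch under stated assumptions: the function name `sigterm_times` is hypothetical, and the timestamp prefix on each line is assumed to match the format of the other logs quoted in this thread, since the SIGTERM line was posted without one.

```python
import re

# Assumed line shape: "<YYYY-MM-DD HH:MM:SS,mmm> ERROR ...: RECEIVED SIGNAL 15: SIGTERM".
# The timestamp prefix is an assumption based on the other logs in this thread.
SIGTERM_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+.*RECEIVED SIGNAL 15: SIGTERM"
)


def sigterm_times(lines):
    """Return the timestamp of every SIGTERM line in an iterable of log lines."""
    times = []
    for line in lines:
        match = SIGTERM_RE.match(line)
        if match:
            # Keep the timestamp to second precision.
            times.append(match.group("ts"))
    return times
```

The function accepts any iterable of lines, so an open file handle for ambari-metrics-collector.log can be passed in directly. Many entries close together would suggest the collector is being killed and restarted repeatedly.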
Created ‎12-09-2016 01:14 AM
I found the cause: 'timeline.metrics.service.webapp.address' in the ams-site.xml file was not modified. Why?
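To confirm which address the collector is configured with, the property can be read straight from ams-site.xml. The sketch below assumes the file uses the usual Hadoop-style `<configuration>/<property>/<name>/<value>` layout. The helper name `read_property` is hypothetical, and the value in `SAMPLE` is illustrative only, not taken from this cluster.

```python
import xml.etree.ElementTree as ET

# Illustrative content only; the value shown here is an assumption.
SAMPLE = """<configuration>
  <property>
    <name>timeline.metrics.service.webapp.address</name>
    <value>0.0.0.0:6188</value>
  </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


print(read_property(SAMPLE, "timeline.metrics.service.webapp.address"))
```

On a real host, the text of the deployed ams-site.xml would be passed in place of `SAMPLE`. If the host or port in the value differs from what the service check posts to (bigdata002.istuary.com:6188 in the log above), the "Connection failed" retries would be expected.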
