Cannot start Ambari-metrics-collector

Re: Cannot start Ambari-metrics-collector

Explorer

I found that the rootdir hostname was wrong. After correcting it, I am getting:

2016-07-08 12:44:39,320 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.domain/172.16.164.131:2181. Will not attempt to authenticate using SASL (unknown error)
2016-07-08 12:44:39,321 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to m2.domain/172.16.164.131:2181, initiating session
2016-07-08 12:44:39,328 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server m2.domain/172.16.164.131:2181, sessionid = 0x255ca408b8d0063, negotiated timeout = 40000
2016-07-08 12:44:50,376 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:07,243 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:16,166 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:32,517 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:45:54,803 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:46:10,720 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:46:37,467 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:47:01,600 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain
2016-07-08 12:47:01,600 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=10, retries=35, started=142264 ms ago, cancelled=false, msg=
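
When a log contains many near-identical WARN lines like the ones above, it can help to confirm that every failure involves the same pair of Kerberos principals. The following is a minimal sketch, assuming the log format shown in this post; the function name `summarize_rpc_failures` is made up for illustration and is not part of HBase or Ambari.

```python
import re
from collections import Counter

# Matches the HBase client warning seen in the log above and captures
# the client principal and the server principal.
PATTERN = re.compile(r"Couldn't setup connection for (\S+) to (\S+)")


def summarize_rpc_failures(lines):
    """Count failed connection attempts per (client, server) principal pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Two of the lines from the log above, used as sample input.
sample = [
    "2016-07-08 12:44:50,376 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: "
    "Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain",
    "2016-07-08 12:45:07,243 WARN org.apache.hadoop.hbase.ipc.AbstractRpcClient: "
    "Couldn't setup connection for amshbase/m2.domain@domain to amshbasemaster/m1.domain@domain",
]
for (client, server), count in summarize_rpc_failures(sample).items():
    print(f"{count} failures: {client} -> {server}")
```

If only one pair shows up, the problem is most likely tied to those two principals (or their keytabs) rather than to the network in general.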

Re: Cannot start Ambari-metrics-collector

Explorer

Also, moving ambari-metrics-collector to another host fails in the wizard with the following error:

stderr: 
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 165, in <module>
    AMSServiceCheck().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 216, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/service_check.py", line 92, in service_check
    raise Fail("Metrics were not saved. Service check has failed. "
resource_management.core.exceptions.Fail: Metrics were not saved. Service check has failed. 
Connection failed.
stdout:
2016-07-08 15:41:07,832 - Ambari Metrics service check was started.
2016-07-08 15:41:07,844 - Generated metrics:
{
  "metrics": [
    {
      "metricname": "AMBARI_METRICS.SmokeTest.FakeMetric",
      "appid": "amssmoketestfake",
      "hostname": "w1.domain",
      "timestamp": 1467992467000,
      "starttime": 1467992467000,
      "metrics": {
        "1467992467000": 0.113469705131,
        "1467992468000": 1467992467000
      }
    }
  ]
}
2016-07-08 15:41:07,844 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:17,856 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:17,857 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:27,867 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:27,867 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:37,878 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:37,878 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:47,891 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:47,892 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:41:57,904 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:41:57,905 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:07,919 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:07,919 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:17,929 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:17,930 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:27,941 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:27,942 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
2016-07-08 15:42:37,956 - Connection failed. Next retry in 10 seconds.
2016-07-08 15:42:37,956 - Connecting (POST) to w3.domain:6188/ws/v1/timeline/metrics/
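
The service check output above only says that the POST to w3.domain:6188 could not connect. A plain TCP check can tell "nothing is listening on that port" apart from an error inside the collector. This is a minimal sketch, not what the Ambari service check itself does; `split_endpoint` and `can_connect` are hypothetical helper names.

```python
import socket


def split_endpoint(endpoint):
    """Split 'host:port/path' (as printed in the log) into (host, port)."""
    host_port = endpoint.split("/", 1)[0]
    host, port = host_port.rsplit(":", 1)
    return host, int(port)


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


host, port = split_endpoint("w3.domain:6188/ws/v1/timeline/metrics/")
print(host, port)
# On a cluster node one would then call, for example:
#   can_connect(host, port)
```

If `can_connect` returns False, the collector process is probably not up on that host, which fits the advice later in this thread to get the ams-hbase daemon running first.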

Re: Cannot start Ambari-metrics-collector

Expert Contributor

@Angel Kafazov Were you able to verify that the AMS keytabs work? Most of the config changes performed above were not needed, for example the changes to the ZooKeeper and znode settings. For distributed mode, the only config changes needed are these:

https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.1.0/bk_ambari_reference_guide/content/_configur...

When you enable security through Ambari, the keytabs and principals are generated by Ambari and applied to the AMS configs.

Before looking into ambari-metrics-collector.log or ambari-metrics-monitor.out, make sure the ams-hbase daemon is up and running. If it is not, the connection timeouts are of no help, since they are expected. Based on the HBase logs posted, the HBase daemon tried to log in and failed, so we need to figure out why it failed. Note: if the collector was moved, the older keytabs would become invalid because the hostname changed, and they would have to be regenerated.

Example of keytab commands:

http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP1/HDP-1.2.0/bk_installing_manually_book/...
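
As a rough illustration of how a keytab is usually verified with the MIT Kerberos tools, the sketch below builds the two commands (`klist -kt` to list the principals in the keytab, `kinit -kt` to try obtaining a ticket with it) and can optionally run them. The keytab path is a hypothetical example and may differ on your cluster; the principal is the server principal that appears in the HBase log earlier in this thread. The helper names are made up for this sketch.

```python
import shlex
import subprocess


def keytab_check_commands(keytab_path, principal):
    """Return the commands commonly used to test a keytab with MIT Kerberos."""
    return [
        ["klist", "-kt", keytab_path],             # list principals in the keytab
        ["kinit", "-kt", keytab_path, principal],  # try to get a ticket with it
    ]


def run_checks(keytab_path, principal):
    """Run the checks in order; return False on the first failing command."""
    for cmd in keytab_check_commands(keytab_path, principal):
        print("$ " + " ".join(shlex.quote(part) for part in cmd))
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)
        if result.returncode != 0:
            return False
    return True


# Hypothetical path (an assumption, check your own AMS configs):
KEYTAB = "/etc/security/keytabs/ams-hbase.master.keytab"
PRINCIPAL = "amshbasemaster/m1.domain@domain"  # taken from the log above
for cmd in keytab_check_commands(KEYTAB, PRINCIPAL):
    print(" ".join(cmd))
# On the collector host, as a user that can read the keytab:
#   run_checks(KEYTAB, PRINCIPAL)
```

If `kinit` fails here, the HBase daemon will fail to log in the same way, so this is worth checking before anything else. After moving the collector, the principal's host part would also need to match the new hostname.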

Re: Cannot start Ambari-metrics-collector

Explorer

Hi @swagle,

Thank you very much for the support. After several retries I managed to delete the service and install it again on another host. This time it worked without my doing much differently than before; I only had to set zookeeper.znode.parent to the HBase value. I really don't know why it worked this time.
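
For anyone checking the same setting: a mismatch in zookeeper.znode.parent can be spotted by reading the property from two Hadoop-style configuration files and comparing the values. This is a minimal sketch under assumptions: the two XML snippets and the znode values in them are invented sample data, not values from this cluster, and `read_property` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the value of a named property from Hadoop-style config XML."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Invented sample data standing in for the two config files being compared.
HBASE_SIDE = """<configuration>
  <property><name>zookeeper.znode.parent</name><value>/ams-hbase-secure</value></property>
</configuration>"""
COLLECTOR_SIDE = """<configuration>
  <property><name>zookeeper.znode.parent</name><value>/ams-hbase-unsecure</value></property>
</configuration>"""

hbase_value = read_property(HBASE_SIDE, "zookeeper.znode.parent")
collector_value = read_property(COLLECTOR_SIDE, "zookeeper.znode.parent")
if hbase_value != collector_value:
    print(f"Mismatch: HBase uses {hbase_value}, collector uses {collector_value}")
```

In practice the XML text would be read from the actual config files on the host rather than from inline strings.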
