Cannot start Ambari-metrics-collector
- Labels: Apache Ambari, Apache HBase
Created ‎07-06-2016 03:33 PM
Hi,
I am having difficulties getting the ambari-metrics-collector to start. I have HBase running in distributed mode.
I have attached the ambari-metrics-collector.log (ambari-metrics-collectorlog.txt).
I already tried the suggestions from this thread: https://community.hortonworks.com/questions/15818/ambari-metrics-collector-now-starting.html as well as the workaround for issue 6 here https://cwiki.apache.org/confluence/display/AMBARI/Known+Issues
Any tips would be much appreciated.
Created ‎07-09-2016 06:28 PM
@Angel Kafazov Were you able to verify that the AMS keytabs work? Most of the config changes made above were not needed, for example the changes to the ZooKeeper and znode settings. For distributed mode, the only config changes needed are these:
When you enable security through Ambari, the keytabs and principals are generated by Ambari and applied to the AMS configs.
Before looking into ambari-metrics-collector.log or ambari-metrics-monitor.out, make sure the ams-hbase daemon is up and running. If it is not, the connection timeouts are expected and tell us nothing. Based on the HBase logs posted, the HBase daemon tried to log in and failed, so we need to figure out why it failed. Note: if the collector was moved to another host, the older keytabs would become invalid because the hostname changed, and they would have to be regenerated.
Example of keytab commands:
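A minimal sketch of such a check, assuming the standard MIT Kerberos `klist` and `kinit` tools are installed on the collector host. The keytab path and principal below are placeholders, not values taken from this thread; substitute whatever Ambari generated for your AMS HBase master. The helper names are hypothetical as well.

```python
import subprocess

# Hypothetical values: replace with the keytab and principal Ambari generated.
KEYTAB = "/etc/security/keytabs/ams-hbase.master.keytab"
PRINCIPAL = "amshbase/collector.example.com@EXAMPLE.COM"


def keytab_check_commands(keytab, principal):
    """Return the commands that inspect a keytab and try to log in with it."""
    return [
        # List the principals and timestamps stored in the keytab.
        ["klist", "-kt", keytab],
        # Request a ticket using the keytab instead of a password.
        ["kinit", "-kt", keytab, principal],
        # Show the ticket cache to confirm that the login worked.
        ["klist"],
    ]


def run_checks(keytab, principal):
    """Run each command in order and stop at the first one that fails."""
    for cmd in keytab_check_commands(keytab, principal):
        print("$ " + " ".join(cmd))
        if subprocess.run(cmd).returncode != 0:
            return False
    return True
```

Calling `run_checks(KEYTAB, PRINCIPAL)` as the user that runs the AMS daemons should return `True` if the keytab is valid for that host. A failure at the `kinit` step would be consistent with the stale-keytab situation described in the note above.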
Created ‎07-06-2016 04:08 PM
Which version of Ambari are you using?
Created ‎07-06-2016 05:11 PM
Hi Orivier,
It is version 2.1.2.1.
Created ‎07-06-2016 05:22 PM
2016-07-06 14:59:14,466 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=m2.domain:2181 sessionTimeout=120000 watcher=hconnection-0x7bc9e6ab0x0, quorum=m2.domain:2181, baseZNode=/hbase-secure
It looks like AMS tried to connect to the HBase cluster's znode.
AMS should use /ams-hbase-secure as its base znode.
Can you check your configuration?
Created ‎07-06-2016 05:56 PM
Hi, I changed it, but now I am getting:
2016-07-06 17:55:20,187 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /ams-hbase-secure is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
Created ‎07-06-2016 06:20 PM
Hi Angel,
From your error, it looks like AMS is talking to the cluster ZooKeeper (port 2181). AMS in version 2.1.2.1 uses its own ZooKeeper in all modes of operation (port 61181).
Can you share your hbase-site.xml in /etc/ams-hbase/conf ? That will help us figure out the issue.
Thanks!
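If it helps to narrow things down before posting the whole file, here is a small sketch that pulls out only the settings discussed in this thread. It assumes the file is in the usual Hadoop `<configuration>/<property>` layout; the function names are just illustrative.

```python
import xml.etree.ElementTree as ET

# Properties that came up in this thread.
KEYS = (
    "hbase.cluster.distributed",
    "hbase.zookeeper.quorum",
    "hbase.zookeeper.property.clientPort",
    "zookeeper.znode.parent",
)


def select_properties(root, keys=KEYS):
    """Return {name: value} for the requested keys under a <configuration> root."""
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in keys:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def read_from_file(path, keys=KEYS):
    """Parse a Hadoop-style XML config file and return the selected properties."""
    return select_properties(ET.parse(path).getroot(), keys)
```

For example, `read_from_file("/etc/ams-hbase/conf/hbase-site.xml")` returns a dict with the znode parent and the ZooKeeper client port, which can then be compared with the values AMS reports in its log.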
Created ‎07-09-2016 10:29 AM
Hi, I changed the port to 61181, but it is not able to connect. I see no service running on port 61181. The log shows the following messages:
2016-07-06 18:37:42,385 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.tmaut.tlabsdata.com/172.16.164.131:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-06 18:37:42,386 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server m2.tmaut.tlabsdata.com/172.16.164.131:61181. Will not attempt to authenticate using SASL (unknown error)
2016-07-06 18:37:42,386 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-07-06 18:37:42,387 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
I have also attached the hbase-site.xml.
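For reference, "Connection refused" can be confirmed independently of AMS with a plain TCP probe. The sketch below only checks whether anything is listening on the given host and port. It does not speak the ZooKeeper protocol, and the function name is just illustrative.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Running `port_is_open("m2.tmaut.tlabsdata.com", 61181)` from the collector host returns `False` while nothing is bound to that port, which would match the stack trace above.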
Created ‎07-06-2016 06:21 PM
Please revert the znode setting to the default if the cluster is not kerberized:
/ams-hbase-unsecure
Also, make sure the quorum value in ams-hbase-site is:
hbase.zookeeper.quorum
{{zookeeper_quorum_hosts}}
Created ‎07-06-2016 06:41 PM
Hi swagle,
The cluster is kerberized. hbase.zookeeper.quorum looks OK.
Created ‎07-06-2016 09:37 PM
See the attached doc; it should help.
