Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Ambari-Metrics collector not starting

avatar
Expert Contributor

When I start ambari-metrics collector, there is no error in starting but it never starts. When I checked the log file, below is what I see:

Value of zookeper.znode.parent is: /hbase-unsecure

 retries=35, started=229269 ms ago, cancelled=false, msg=
2016-02-09 19:39:15,043 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:39:15,043 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=20, retries=35, started=249286 ms ago, cancelled=false, msg=
2016-02-09 19:39:35,080 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:39:35,081 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=21, retries=35, started=269324 ms ago, cancelled=false, msg=
2016-02-09 19:39:55,151 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:39:55,152 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=22, retries=35, started=289395 ms ago, cancelled=false, msg=
2016-02-09 19:40:15,192 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:40:15,193 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=23, retries=35, started=309436 ms ago, cancelled=false, msg=
2016-02-09 19:40:35,275 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:40:35,276 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=24, retries=35, started=329519 ms ago, cancelled=false, msg=
2016-02-09 19:40:55,283 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:40:55,283 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=25, retries=35, started=349526 ms ago, cancelled=false, msg=
2016-02-09 19:41:15,437 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:41:15,437 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=26, retries=35, started=369680 ms ago, cancelled=false, msg=
2016-02-09 19:41:35,604 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:41:35,604 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=27, retries=35, started=389847 ms ago, cancelled=false, msg=
2016-02-09 19:41:55,633 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:41:55,635 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=28, retries=35, started=409878 ms ago, cancelled=false, msg=
2016-02-09 19:42:15,646 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:42:15,646 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=29, retries=35, started=429889 ms ago, cancelled=false, msg=
2016-02-09 19:42:35,841 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:42:35,841 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=30, retries=35, started=450084 ms ago, cancelled=false, msg=
2016-02-09 19:42:56,032 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:42:56,032 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=31, retries=35, started=470275 ms ago, cancelled=false, msg=
2016-02-09 19:43:16,088 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:43:16,088 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=32, retries=35, started=490331 ms ago, cancelled=false, msg=
2016-02-09 19:43:36,265 ERROR org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: The node /hbase is not in ZooKeeper. It should have been written by the master. Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2016-02-09 19:43:36,265 INFO org.apache.hadoop.hbase.client.RpcRetryingCaller: Call exception, tries=33, retries=35, started=510508 ms ago, cancelled=false, msg=


1 ACCEPTED SOLUTION

avatar
Contributor

I could finally solve it by combining some of the steps mentioned above.

I first checked what is the value of `zookeeper.znode.parent` in HBase. I tried setting that same value in Ambari, but that did not work because some of the metrics processes were already running on that machine. So, i had to `ps -ef | grep metrics` and kill all of them as they were caching the `/hbase` value.

Watch the ambari metrics collector logs ( /var/log/ambari-metrics-collector/ambari-metrics-collector.log) while you do the below steps

Steps:

0. tail -f /var/log/ambari-metrics-collector/ambari-metrics-collector.log

1. Stop Ambari

2. Kill all the metrics processes

3. curl --user admin:admin -i -H "X-Requested-By: ambari" -X DELETE http://`hostname -f`:8080/api/v1/clusters/CLUSTERNAME/services/AMBARI_METRICS

=> Make sure you replace CLUSTERNAME with your cluster name

4. Refresh Ambari UI

5. Add Service

6. Select Ambari Metrics

7. In the configuration screen, make sure to set the value of `zookeeper.znode.parent` to what is configured in the HBase service. By default in Ambari Metrics it is set to empty value.

8. Deploy

View solution in original post

31 REPLIES 31

avatar
Expert Contributor

Never mind guys, i fixed the issue. Someone moved the python and it broke the link. I've re-linked it and was able to bring the metrics-collector online.

avatar

a. ambari-metrics-monitor status

b. ambari-metrics-monitor stop (if it is running)

c. Check the ambari-metrics-collector/hbase-tmp directroy path in Ambari AMS config

d. Move the hbase zk temp directory to somewhere

mv /hadoop/journalnode/var/lib/ambari-metrics-collector/hbase-tmp/zookeeper/* /tmp/ams-zookeeper-backup/

e. Restart AMS from the Ambari


That should resolve the issue


Cheers, Pravat Sutar