Cannot start Ambari-Metrics Collector on HDP 2.5

I am trying to start Ambari Metrics through the Ambari Web UI. However, the Metrics Collector does not start, even after multiple tries. I looked into the log files, and the problem seems to be with the ZooKeeper client connection. The connection is successfully established and a session is initiated, but after the Phoenix metrics system has started, no further data can be read from the session. The socket connection is assumed to be closed, and after several retries the process is aborted.

Kindly provide a working solution, or let me know if more information is required.

1 ACCEPTED SOLUTION

I have switched to HDP 2.4, where you only need to start the Metrics Monitors from Ambari for everything to work. There is probably a problem with HDP 2.5. Please let me know if you have a working solution for HDP 2.5.

10 REPLIES

Super Mentor

@Priyansh Saxena

Is it an embedded or a distributed Metrics Collector?

Did you try cleaning up the ZooKeeper state and restarting? In embedded mode, an improper shutdown can sometimes corrupt the state.

Please find the value of "hbase.tmp.dir" in the AMS configs (default = /var/lib/ambari-metrics-collector/hbase-tmp/), then try the following:

rm -rf /var/lib/ambari-metrics-collector/hbase-tmp/
OR
mv /var/lib/ambari-metrics-collector/hbase-tmp  /Backup_Dir

- Also try removing the AMS ZooKeeper data by backing up and then removing the contents of 'hbase.tmp.dir'/zookeeper,

and remove any Phoenix spool files from the 'hbase.tmp.dir'/phoenix-spool folder.

- Then try restarting AMS.

- If the issue still persists, please share the complete stack trace of the error.

Reference: "Cleaning up Ambari Metrics System Data"

https://cwiki.apache.org/confluence/display/AMBARI/Cleaning+up+Ambari+Metrics+System+Data
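
The cleanup steps above can be sketched as a small shell function that moves the data aside instead of deleting it. This is only a sketch under assumptions: the function name `ams_backup_tmp` is hypothetical (it is not part of Ambari), the backup location is whatever you pass in, and it is generally safer to stop the Metrics Collector from Ambari before touching these directories.

```shell
# Sketch: move the embedded ZooKeeper data and the Phoenix spool files
# out of hbase.tmp.dir so that AMS starts from a clean state.
# ams_backup_tmp is a hypothetical helper name, not an Ambari command.
#   $1 = value of hbase.tmp.dir
#   $2 = backup directory (created if missing)
ams_backup_tmp() {
    ams_tmp_dir="$1"
    ams_backup_dir="$2"
    [ -d "$ams_tmp_dir" ] || { echo "no such directory: $ams_tmp_dir" >&2; return 1; }
    mkdir -p "$ams_backup_dir" || return 1
    for sub in zookeeper phoenix-spool; do
        # Skip subdirectories that do not exist on this host.
        if [ -e "$ams_tmp_dir/$sub" ]; then
            mv "$ams_tmp_dir/$sub" "$ams_backup_dir/$sub" || return 1
            echo "moved $ams_tmp_dir/$sub to $ams_backup_dir/$sub"
        fi
    done
}

# Example call, using the default path quoted above:
# ams_backup_tmp /var/lib/ambari-metrics-collector/hbase-tmp /Backup_Dir
```

Moving rather than deleting keeps the old state available in case it has to be restored or inspected later.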

@Jay SenSharma

Mine is an embedded Metrics Collector. I have located the folders, but somehow I am unable to remove the files inside this folder. It says 'invalid argument' for every folder. Please suggest a workaround, or tell me if there is a problem with the way I did it. I am adding a screenshot of the commands I issued inside the hbase.tmp.dir.

11251-post.png

Expert Contributor

Can you please provide the output of the "lsattr" command in your hbase-tmp folder?

@Edgar Daeds, did you mean "ls -attr"? Please find attached the result for that. "lsattr" did not seem to work.

post2.png

Expert Contributor

OK, never mind. The "lsattr" command lists file attributes, and some attributes can be used to block file deletion, but when one of them is set the output looks different, so that's not the reason.
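
For readers who have not used "lsattr": each output line starts with a field of attribute flags followed by the file name, and a lowercase `i` in that field marks the file as immutable, which blocks deletion even for root until it is cleared with `chattr -i`. The helper below is a hypothetical sketch (the name `is_immutable_line` is made up for illustration) that checks one line of lsattr output for that flag.

```shell
# Sketch: report whether one line of lsattr output carries the immutable flag.
# is_immutable_line is a hypothetical helper name.
#   $1 = a single line of lsattr output, e.g. "----i---------e------- ./zookeeper"
is_immutable_line() {
    attrs=${1%% *}   # everything before the first space is the flag field
    case "$attrs" in
        *i*) return 0 ;;   # lowercase i = immutable
        *)   return 1 ;;
    esac
}

# Example: check the hbase-tmp directory itself (path is the default quoted earlier).
# line=$(lsattr -d /var/lib/ambari-metrics-collector/hbase-tmp)
# is_immutable_line "$line" && echo "immutable flag is set"
```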

Instead of deleting the whole folder, delete only the contents of /hbase-tmp/zookeeper/zookeeper_0/version-2/* and restart the Metrics Collector.
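
A minimal sketch of that narrower cleanup, assuming GNU or BSD `find` (for `-mindepth` and `-delete`). The function name `clear_zk_version2` is hypothetical, and the argument is the value of hbase.tmp.dir on your host.

```shell
# Sketch: empty only the embedded ZooKeeper version-2 directory and
# leave the directory itself (and the rest of hbase.tmp.dir) in place.
# clear_zk_version2 is a hypothetical helper name.
#   $1 = value of hbase.tmp.dir
clear_zk_version2() {
    zk_dir="$1/zookeeper/zookeeper_0/version-2"
    [ -d "$zk_dir" ] || { echo "not found: $zk_dir" >&2; return 1; }
    # -mindepth 1 excludes version-2 itself; unlike a "version-2/*" glob,
    # this also removes hidden files.
    find "$zk_dir" -mindepth 1 -delete
}

# Example call, using the default path quoted earlier in the thread:
# clear_zk_version2 /var/lib/ambari-metrics-collector/hbase-tmp
```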

@Edgar Daeds, I followed the instructions given above. The Metrics Collector showed "Started" as the status after the restart. However, upon refreshing the page, the status went back to "Stopped". Do you think this is some sort of issue in HDP 2.5? I started the Metrics Collector in HDP 2.4 and it seems to work fine there.

Expert Contributor

I am having the same issue. When I restart the Metrics Collector, it sometimes goes down after a while. Deleting the contents of hbase-tmp/.../version-2/* helps. I am using HDP 2.5.

Could you please share the collector log?

The log file: ambari-metrics-collector.zip

When you say "after a while", does it mean that it comes back up normally after you restart it? I tried deleting the contents of version-2/* multiple times (the directory was empty anyway).

Expert Contributor

Thanks. By "after a while" I meant that the Metrics Collector goes down after 30 seconds to 1 minute.

OK then, if you can't remove the whole folder, did you try moving it to another place, e.g. /tmp?

What operating system are you using?

Also try moving or deleting the location of "hbase.rootdir" (Ambari Metrics -> Config -> Advanced ams-hbase-site).
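
If the Ambari UI is not handy, the configured value can also be read from the rendered site file on the collector host. The sketch below assumes a standard Hadoop-style XML file with `<name>` and `<value>` elements; the function name `get_hadoop_prop` is hypothetical, and the config path in the example comment is an assumption that may differ on your installation.

```shell
# Sketch: print the <value> of one property from a Hadoop-style site XML file.
# get_hadoop_prop is a hypothetical helper name.
#   $1 = path to the XML file
#   $2 = property name, e.g. hbase.rootdir
get_hadoop_prop() {
    awk -v key="$2" '
        # Remember when the matching <name> element has been seen.
        index($0, "<name>" key "</name>") { found = 1 }
        # Print the first <value> at or after that point, then stop.
        found && match($0, /<value>[^<]*<\/value>/) {
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example call (the path is an assumption, check where your AMS HBase config lives):
# get_hadoop_prop /etc/ams-hbase/conf/hbase-site.xml hbase.rootdir
```

In embedded mode the value is typically a local file: URI, so the directory can be moved aside with mv like the hbase-tmp folder; in distributed mode it points into HDFS and has to be handled with HDFS tools instead.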
