No data available for Ambari Metric Collector

Rising Star

My Ambari Metrics Collector keeps going down.

In the log, I found:

Failed to get result with timeout, timeout = 300000ms row 'METRIC_AGGREGATE' on table 'SYSTEM.CATALOG' at region=SYSTEM.CATALOG, ***** host name=[hostname], [port] *** seqNum=6....caused by ....IOException: Failed to get result with timeout, timeout=300000ms

Then it gets stuck at:

org.apache.hadoop.hbase.client.AsyncProcess:#1 waiting for 28379 actions to finish

It seems related to HBase, but my HBase runs well without any errors.

If I restart the Ambari Metrics Collector, it recovers immediately, then becomes unavailable again after several hours.

How can I fix it? Thanks.


Super Mentor

@Junfeng Chen

1. Which version of AMS are you using?

If you are using Ambari 2.6.x, I suggest using the Ambari Metrics Collector that comes with Ambari, which is much more stable and includes many additional fixes.

2. Can you please share the logs present inside /var/log/ambari-metrics-collector/: ambari-metrics-collector.log, ams-hbase-master.log, and the GC logs collector-gc.log and gc.log.

3. While AMS is running fine, can you please collect the output of the following API calls and attach the JSON output here? It will help us determine whether any performance tuning is needed.



Rising Star

I am using Ambari

The Ambari Metrics Collector log is stored in a production environment and I cannot export it, so I can only give you the error log by typing it out by hand.

It is now running in embedded mode.

ambari metrics log:

MetaDataProtos$MetaDataService for row \x00\x00METRIC_RECORD


Caused by java.lang.InterruptedException

ams hbase log:

FSDataInputStreamWrapper: Failed to invoke 'unbuffer' method in class class org.apache.hadoop.fs.FSDataInputStream

So there may be a TCP socket connection left open in CLOSE_WAIT state


caused by java.lang.UnsupportedOperationException: this stream does not support unbuffering

All AMS settings are at their defaults. There are 6 nodes in total in the cluster. The host running AMS has 64 cores and 256GB of memory. Currently it has 53GB of free memory and 233GB of free memory in cache.

Super Mentor

@Junfeng Chen

You seem to be hitting a known bug in Ambari Metrics Collector 2.6.1 which causes too many CLOSE_WAIT sockets and ultimately leads to AMS shutting down after some time.

You will notice the number of CLOSE_WAIT sockets growing over time.

# netstat -anlp | grep :6188 | grep CLOSE_WAIT | wc -l
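A single count is only a snapshot, so it can help to sample it at intervals and watch for growth. The following is a minimal sketch, assuming Linux net-tools style `netstat` output (local address in column 4, remote address in column 5, state in column 6). The function names `count_close_wait` and `watch_close_wait` and the 60-second interval are illustrative choices, not part of AMS.

```shell
#!/bin/sh
# count_close_wait PORT
# Reads netstat-style lines on stdin and prints how many sockets that use
# PORT (as local or remote port) are in the CLOSE_WAIT state.
count_close_wait() {
    awk -v port="$1" '
        BEGIN { pat = ":" port "$" }   # match the exact port, not e.g. 61880
        $6 == "CLOSE_WAIT" && ($4 ~ pat || $5 ~ pat) { n++ }
        END { print n + 0 }            # prints 0 when nothing matched
    '
}

# watch_close_wait PORT [INTERVAL_SECONDS]
# Prints a timestamped count every INTERVAL_SECONDS (assumed default: 60).
watch_close_wait() {
    while true; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
            "$(netstat -ant | count_close_wait "$1")"
        sleep "${2:-60}"
    done
}
```

Running `watch_close_wait 6188` on the collector host and leaving it for a few hours should show whether the count climbs steadily after a restart.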


I am sure that apart from the following entry in your logs

TCP socket connection left open in CLOSE_WAIT state

You will also find the following kind of logging inside your "/var/log/ambari-metrics-collector/hbase-ams-master-*.log" log. If you notice that line, it means you are hitting the same bug.

Failed to invoke 'unbuffer' method in class class org.apache.hadoop.fs.FSDataInputStream
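To check quickly how often that message appears, something like the following can be used. This is only a sketch: the function name `count_unbuffer_errors` is made up for illustration, and it simply counts lines containing the fixed string quoted above in whatever files are passed to it.

```shell
#!/bin/sh
# count_unbuffer_errors FILE...
# Prints the total number of lines across the given files that contain the
# "Failed to invoke 'unbuffer' method" message.
count_unbuffer_errors() {
    # grep -c exits non-zero when there are no matches but still prints 0,
    # so its exit status is deliberately ignored.
    cat -- "$@" | grep -cF "Failed to invoke 'unbuffer' method" || true
}
```

For example: `count_unbuffer_errors /var/log/ambari-metrics-collector/hbase-ams-master-*.log`. A non-zero count together with a growing CLOSE_WAIT count suggests the same issue.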



You should upgrade to Ambari, which has many additional fixes, including some security-related fixes from an AMS perspective.

The post-upgrade steps, which include the AMS upgrade, are described in the following doc:

Rising Star

For netstat -anlp | grep :6188 | grep CLOSE_WAIT | wc -l

I get 0. Maybe my restarting the service solved the problem temporarily.

Do you mean the CLOSE_WAIT problem is related to

Failed to invoke 'unbuffer' method in class class org.apache.hadoop.fs.FSDataInputStream


Super Mentor

@Junfeng Chen

Yes, the CLOSE_WAIT connection message in your logs points to the same bug, which is addressed in Ambari

TCP socket connection left open in CLOSE_WAIT state


So it is better to upgrade.

Restarting AMS will temporarily fix the issue, but after some time you may again notice that AMS has gone down. The permanent remedy is to upgrade Ambari and AMS to

Rising Star

OK @Jay Kumar SenSharma

So can I upgrade the Ambari Metrics service individually, instead of upgrading the whole of Ambari?

Super Mentor

@Junfeng Chen

Yes, but it is better to upgrade Ambari and AMS together, as that is the recommended way.

