data source for ambari metrics collector

I want to find the data source that the Ambari Metrics Collector uses for the metrics it collects from different servers.

I navigated to /var/lib/ambari-metrics-collector/hbase/data/default and could see many tables there, such as METRIC_AGGREGATE, METRIC_AGGREGATE_HOURLY, and METRIC_RECORD, but I want to know the location of the data from which these tables are loaded.


Super Mentor

@Anurag Mishra

Please try this:

# /usr/lib/ambari-metrics-collector/bin/
0:> !tables
0:> select * from SYSTEM.STATS;
0:> select count (*) from METRIC_AGGREGATE;


Please replace "" with your Ambari Metrics Collector hostname, and replace /ams-hbase-unsecure with the value of "zookeeper.znode.parent" set in your Advanced ams-hbase-site.

@Jay Kumar SenSharma

Yes Jay, but here we are using Phoenix to query the tables, for example selecting a count from METRIC_AGGREGATE. I want to find the data source from which the METRIC_AGGREGATE table is loaded.

Super Mentor

@Anurag Mishra
Please check the mode of your AMS collector (Embedded or Distributed):

# grep -B 1 -A 2 'mode' /etc/ambari-metrics-collector/conf/ams-site.xml

If it is Distributed, the data is stored on HDFS.

In Embedded mode, the data is stored on the local filesystem, at a location like the one shown below.

# grep -B 1 -A 2 'hbase.rootdir' /etc/ambari-metrics-collector/conf/hbase-site.xml

# grep -B 1 -A 2 'hbase.rootdir' /etc/ams-hbase/conf/hbase-site.xml

Please check the path set for the property "hbase.rootdir" in your AMS configs, under "Advanced ams-hbase-site".

# ls -l /var/lib/ambari-metrics-collector/hbase
total 12
drwxr-xr-x. 2 ams hadoop    6 Feb 15 07:01 archive
drwxr-xr-x. 2 ams hadoop    6 Jan 15 23:24 corrupt
drwxr-xr-x. 4 ams hadoop   32 Aug 11  2017 data
-rw-r--r--. 1 ams hadoop   42 Aug 11  2017
-rw-r--r--. 1 ams hadoop    7 Aug 11  2017 hbase.version
drwxr-xr-x. 2 ams hadoop   43 Feb 15 06:05 MasterProcWALs
drwxr-xr-x. 2 ams hadoop 4096 Feb 15 07:04 oldWALs
drwxr-xr-x. 4 ams hadoop   76 Jan 24 05:49 WALs
[root@amb25102 ~]# grep -B 1 -A 2 'hbase.rootdir' /etc/ambari-metrics-collector/conf/ams-site.xml
[root@amb25102 ~]# 
[root@amb25102 ~]# grep -B 1 -A 2 'hbase.rootdi
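
The same check can be scripted instead of grepping by hand. The following is only a minimal sketch: it assumes the usual Hadoop-style layout of hbase-site.xml (a `<configuration>` element holding `<property>` entries with `<name>` and `<value>` children), it uses the /etc/ams-hbase/conf/hbase-site.xml path from the grep command above, and the helper names `read_property` and `storage_kind` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_property(xml_data, name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_data)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def storage_kind(rootdir):
    """Classify an hbase.rootdir value by its URI scheme (illustrative only)."""
    scheme = urlparse(rootdir).scheme
    if scheme == "hdfs":
        return "hdfs"
    if scheme in ("", "file"):
        return "local filesystem"
    return "other (" + scheme + ")"


if __name__ == "__main__":
    # Path taken from the thread; adjust it if your config lives elsewhere.
    path = "/etc/ams-hbase/conf/hbase-site.xml"
    try:
        with open(path, "rb") as fh:
            rootdir = read_property(fh.read(), "hbase.rootdir")
    except OSError as exc:
        print("could not read", path, "-", exc)
    else:
        if rootdir is None:
            print("hbase.rootdir is not set in", path)
        else:
            print("hbase.rootdir =", rootdir, "->", storage_kind(rootdir))
```

A value with the `hdfs` scheme corresponds to the Distributed case described above, while a plain path or a `file` URI corresponds to Embedded mode.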


@Jay Kumar SenSharma

Hi Jay

This data must be coming from some log files, and those log files are what I need. I am not sure, but my requirement is to get the source data, that is, wherever AMS is collecting its data from.

Super Mentor

@Anurag Mishra

AMS uses HBase internally to store data. So the data files are not plain text which you can read as log files.

Do you not see the following kind of files in your filesystem?

# cd /var/lib/ambari-metrics-collector/hbase

# find . -name "*region*"


@Jay Kumar SenSharma

Yes Jay, I can see all these files, but my question is this: AMS must be fetching the data from the cluster and then putting it into HBase.

I want to see where AMS fetches all this data from before it puts it into HBase.

My requirement is to get details such as HBase dead region server count, HBase cluster requests/min, HBase connections, HBase CPU usage, HBase memory usage, HBase heap memory used, HBase process state, HBase committed memory, HBase tables, HBase time in CPU, HBase/Region Servers online, and HBase/Region Server requests per second, from a file, without using HBase SQL.

I was thinking that if I could get the data AMS is using, I would be able to use all these records for my purpose with the use of AMS.

So basically I want to get all these records without using AMS, and for that I need the files or data that AMS collects from.

@Jay Kumar SenSharma

Or, in other words: if the Metrics Collector were not there, which files (logs) would I look at to get these metrics?

Super Mentor

@Anurag Mishra

As there are multiple related threads open for this query, I am posting the following link, which answers your query, here. (Please close the thread if it answers your query.)


The Ambari Metrics service uses psutil to collect metrics from the host system; this collection is done by the Ambari Metrics Monitor running on each host, which then reports to the Collector. psutil is a Python library for retrieving information on running processes and system utilization. You can read more on psutil within Ambari Metrics here
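
To give a feel for what psutil exposes, here is a small sketch that reads a few host-level values directly. It is not the AMS code: the function name `host_snapshot` and the dictionary keys are made up for illustration, and it assumes psutil is installed (`pip install psutil`). The calls used (`cpu_percent`, `virtual_memory`, `disk_usage`) are standard psutil functions.

```python
try:
    import psutil  # third-party library; not part of the standard library
except ImportError:  # keep the sketch importable where psutil is missing
    psutil = None


def host_snapshot(ps=psutil, cpu_interval=0.1):
    """Read a few host-level gauges through a psutil-compatible module.

    The key names below are illustrative, not the metric names AMS uses.
    """
    if ps is None:
        raise RuntimeError("psutil is not installed")
    mem = ps.virtual_memory()
    disk = ps.disk_usage("/")
    return {
        # Blocks for cpu_interval seconds to measure CPU utilization.
        "cpu_percent": ps.cpu_percent(interval=cpu_interval),
        "mem_total_bytes": mem.total,
        "mem_used_percent": mem.percent,
        "disk_used_percent": disk.percent,
    }


if __name__ == "__main__":
    if psutil is None:
        print("psutil is not installed")
    else:
        for name, value in sorted(host_snapshot().items()):
            print(name, value)
```

These values are read live from the operating system on each call; there is no log file behind them, which is why host metrics of this kind cannot be recovered from logs.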

For the Hadoop components, AMS uses a Hadoop Metrics2 sink to collect metrics from each component. The source code can be found here

AMS uses org.apache.hadoop.metrics2.sink.timeline.cache.TimelineMetricsCache to hold intermediate data until it is time to send it to HBase. The emitMetrics function sends the metrics on to HBase once the configured amount of time has passed.
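
The buffer-then-send pattern described here can be sketched in a few lines. This is an illustration of the general idea only, not a port of the Java class: `MetricsBuffer`, its parameters, and the `send` callback are all hypothetical names, and the real cache and its emitMetrics call may behave differently in detail.

```python
import time


class MetricsBuffer:
    """Hold metric points in memory and hand them to `send` in batches."""

    def __init__(self, send, interval_seconds=60.0, clock=time.monotonic):
        self._send = send                  # callable that receives a list of points
        self._interval = interval_seconds  # minimum time between two sends
        self._clock = clock                # injectable so the sketch can be tested
        self._pending = []
        self._last_flush = clock()

    def put(self, name, value):
        """Buffer one point, then flush if the interval has elapsed."""
        self._pending.append((name, value))
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        """Send everything buffered so far and restart the interval."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._send(batch)
        self._last_flush = self._clock()


if __name__ == "__main__":
    # An interval of 0 makes every put() flush immediately.
    buf = MetricsBuffer(send=print, interval_seconds=0.0)
    buf.put("example.metric", 1)
```

The point of the pattern is that metric values live only in memory between sends, so there is no intermediate file on disk to read them from.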

I hope this helps you out.
