Created 10-28-2014 09:05 AM
Hello Jy,
Per our other thread, these monitors continually gather metrics regardless of the level of activity in the cluster. Let's consider:
Host Monitor (min. 10GiB)
- gathers metrics around node-level entities of interest (characteristics like disk space usage, RAM, CPU, etc.)
-- Remember that these kinds of metrics are important and gathered/persisted regardless of the level of activity in the cluster. These metrics are still as useful in idle periods as they are in times of heavy load.
Service Monitor (min. 10GiB)
- gathers metrics around the configured roles and services in your cluster
-- Similarly, these kinds of metrics are important and gathered/persisted regardless of the level of activity in the cluster.
-- These include the metrics that inform and power health checks such as the HDFS, HBase, ZooKeeper and Hive Canary functions, which detect problems with those services and notify you of them early. These checks run constantly regardless of idle/use period, as they're always relevant.
The Service Monitor is also responsible for gathering metrics around the YARN Applications being run and the Impala Queries issued. Storage for these is dedicated separately from the Service Monitor's 10GiB above. By default, the YARN Application and Impala Query segments each require a minimum of 1GiB. Unlike the core Host and Service Monitor functionality, THESE would indeed vary or grow/recycle depending on the rate of activity within the cluster.
That said, depending on how long you'd like to keep detailed metrics around YARN jobs or Impala Queries, do adjust that dedicated storage space upward if appropriate, and ensure it's located on a filesystem with adequate space to accommodate the size you specify.
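To make the sizing concrete, here is a rough Python sketch that adds up the minimums mentioned above and checks whether a filesystem has enough free space for them. The dictionary keys, function names and the idea of placing everything on one filesystem are assumptions made for illustration only; they are not Cloudera Manager property names or APIs.

```python
import shutil

GIB = 1024 ** 3

# Minimum sizes quoted in this post, in GiB.
# The keys are illustrative labels, not real configuration property names.
MINIMUMS_GIB = {
    "host_monitor": 10,
    "service_monitor": 10,
    "yarn_applications": 1,
    "impala_queries": 1,
}


def required_bytes(overrides_gib=None):
    """Return the total storage needed, in bytes.

    overrides_gib maps a label from MINIMUMS_GIB to a larger size in GiB,
    for example when you want to retain YARN or Impala detail for longer.
    """
    sizes = dict(MINIMUMS_GIB)
    for name, size in (overrides_gib or {}).items():
        if name not in sizes:
            raise KeyError(f"unknown storage segment: {name}")
        if size < MINIMUMS_GIB[name]:
            raise ValueError(f"{name} cannot go below {MINIMUMS_GIB[name]} GiB")
        sizes[name] = size
    return sum(sizes.values()) * GIB


def has_room(path, needed_bytes):
    """Check whether the filesystem holding path has needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes


if __name__ == "__main__":
    # Assumption: all monitor storage lives on the filesystem holding ".".
    needed = required_bytes({"yarn_applications": 5})
    print(f"Need {needed / GIB:.0f} GiB, enough free space: {has_room('.', needed)}")
```

With the defaults this comes to 22GiB in total; raising the YARN Application segment to 5GiB brings it to 26GiB. If the monitors store data on different filesystems, run the check once per location with the relevant subset of sizes.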
Regards,
--
Mark S.