/var/lib/cloudera-service-monitor size

Contributor

Hi,

 

I have noticed that the /var/lib/cloudera-service-monitor and /var/lib/cloudera-host-monitor directories have been growing at a concerning rate. To my understanding, the default (minimum) size limit for each of these data stores is 10 GB (20 GB total). I have moved /var/lib/cloudera-host-monitor to another directory so there is enough space for the data to begin rolling. Is it safe to remove the old files in /var/lib/cloudera-host-monitor now that the data is being written to the new directory?
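
For anyone wanting to track that growth over time, here is a minimal sketch that totals the on-disk size of the two default directories named above. The function names are made up for illustration, and it only reads the filesystem; it is not a Cloudera Manager tool.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                # Skip symlinks, and tolerate files that disappear
                # mid-walk because the store is being rewritten.
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass
    return total


def report(paths):
    """Map each existing directory to its size in GiB."""
    return {p: dir_size_bytes(p) / 1024 ** 3 for p in paths if os.path.isdir(p)}


if __name__ == "__main__":
    monitor_dirs = [
        "/var/lib/cloudera-service-monitor",
        "/var/lib/cloudera-host-monitor",
    ]
    for path, gib in report(monitor_dirs).items():
        print(f"{path}: {gib:.2f} GiB")
```

Running it from cron and logging the output would show whether the stores have levelled off at their limit or are still climbing.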

 

Looking at similar posts, it appears that in CDH 4.x there is an option to set the purge/expiration timeframe for the /var/lib/cm data. I do not see this option in CDH 5.2.

 

 

Thanks

 

CDH 5.2

1 ACCEPTED SOLUTION

Super Collaborator

Well, the data in "/var/lib/cloudera-[host|service]-monitor" is the sum total of the working data for these respective services. If you delete it, yes, you can reclaim space, but as long as the services still write to those locations the data will grow again, up to the 10 GB per service max, until you shift the location.

 

I don't advocate fully deleting the data for these services in any normal scenario (it's quite a drastic option), but if you simply must reclaim the space, you can choose to lose this data and your cluster's core functionality will remain OK. Your Health statuses will be Unknown or Bad for a short time, and you will lose all Charts in the UI while the timeseries store is rebuilt and repopulated (because you are deleting ALL the historical metrics). If those conditions are OK in your small dev cluster, then you can make your choice accordingly, sure.

 

Regards,

--

Mark S.

4 REPLIES

Super Collaborator

Hi Jy,

 

Within Cloudera Manager 5.0 and above, the Service and Host Monitors use on-disk storage (local to the node where each of these processes runs) instead of an RDBMS. The Service and Host Monitors each require a minimum of 10 GB for their storage.

 

In Cloudera Manager 4.x, the configuration concept was to describe how many hours' or days' worth of metrics you wanted to keep, and the applications would self-purge to remain within that bound. As you can imagine though, 14 days' worth of metrics for a cluster with 1000 hosts could require a dramatically different amount of space than the same 14 days for a cluster with 3 nodes! Space planning with this model is possible, but difficult.
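
To put rough numbers on that, here is a back-of-the-envelope sketch. The per-host metric count, sampling rate, and bytes per sample are made-up placeholders, not Cloudera Manager figures; the point is only that with time-based retention the space needed scales linearly with the host count.

```python
def retention_bytes(hosts, days, metrics_per_host=100,
                    samples_per_hour=60, bytes_per_sample=16):
    """Estimate the space needed to keep `days` of metrics.

    metrics_per_host, samples_per_hour and bytes_per_sample are
    hypothetical placeholders chosen only to make the scaling visible.
    """
    samples = hosts * metrics_per_host * samples_per_hour * 24 * days
    return samples * bytes_per_sample


small = retention_bytes(hosts=3, days=14)
large = retention_bytes(hosts=1000, days=14)

print(f"3 hosts:    {small / 1e9:.2f} GB")
print(f"1000 hosts: {large / 1e9:.2f} GB")
print(f"ratio:      {large / small:.0f}x")
```

Whatever placeholder values you plug in, the ratio between the two clusters stays at 1000/3, roughly 333 times the space for the same 14 days.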

 

Now, with Cloudera Manager 5.x+, these two (Service Monitor and Host Monitor only) each use a dedicated amount of space. They will not exceed the configured amount, which has a minimum of 10 GB each. You absolutely should adjust this amount upward depending on

 

1) The number of hosts in your cluster (which is relevant when thinking of what the Host Monitor does) as well as

2) The number of configured services and roles (which is relevant when thinking about what the Service Monitor does).

 

Likewise, their default data directory locations in /var/lib/cloudera-[host|service]-monitor/ are just that - a default. Do feel free to move these to a location that's more appropriate for your environment.
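
Before pointing a monitor at a new location, it can help to confirm that the target filesystem has room for at least the 10 GB minimum (20 GB if both monitors will share it). A minimal sketch using only the Python standard library; the helper names and the /data path are hypothetical.

```python
import os
import shutil

GIB = 1024 ** 3


def nearest_existing(path):
    """Walk up from path until a directory that already exists is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path


def has_room(target_dir, required_gib=10):
    """True if the filesystem that would hold target_dir has enough free space."""
    usage = shutil.disk_usage(nearest_existing(target_dir))
    return usage.free >= required_gib * GIB


if __name__ == "__main__":
    # Hypothetical new location; substitute your own.
    candidate = "/data/cloudera-host-monitor"
    print(candidate, "OK" if has_room(candidate, required_gib=10) else "too small")
```

The check looks at the nearest existing parent because the new directory usually does not exist until the role is restarted with the new setting.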

 

Please let me know if I can help clarify further.

 

Regards,

--

Mark S.

Contributor

With that said, and having moved these to another location, is there any reason to keep the old data stored in the previous location, "/var/lib/cloudera-[host|service]-monitor"? I would prefer to remove the old data files if possible. In my case this is a small cluster and all the logs point to the /var partition, hence the need to relocate the directory. This is a temporary solution until I can get another partition with more space or until there are more nodes to spread the daemons across.

Contributor

Hi, I changed the location of these two directories (after looking for other things to delete to bring down my 98% disk usage), restarted Cloudera Manager, checked that the new directories had been created in the new location (they had), and then simply removed the /var/lib/cloudera-[host|service]-monitor directories. The result was amazing:

 

[root]# df -kh
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        19G  8.0G  9.9G  45% /
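
For anyone repeating this, the same sequence (relocate, restart, confirm the new directories are in use, then delete the old ones) can be wrapped in a small guard so the old data is only removed once the new location clearly exists and holds data. This is a sketch under my own assumptions: the function name is invented, it defaults to a dry run, and it knows nothing about Cloudera Manager itself.

```python
import os
import shutil


def remove_old_monitor_dir(old_dir, new_dir, dry_run=True):
    """Delete old_dir only if new_dir exists, is non-empty, and is
    neither the same directory as old_dir nor nested inside it."""
    old_real = os.path.realpath(old_dir)
    new_real = os.path.realpath(new_dir)

    if not os.path.isdir(old_real):
        return "nothing to do: old directory does not exist"
    if not os.path.isdir(new_real) or not os.listdir(new_real):
        return "refusing: new directory is missing or empty"
    # commonpath equals old_real when new_real is old_real or sits below it.
    if os.path.commonpath([old_real, new_real]) == old_real:
        return "refusing: new directory is the old directory or lives inside it"
    if dry_run:
        return f"dry run: would remove {old_real}"

    shutil.rmtree(old_real)
    return f"removed {old_real}"


if __name__ == "__main__":
    # Hypothetical new location; substitute your own.
    print(remove_old_monitor_dir("/var/lib/cloudera-host-monitor",
                                 "/data/cloudera-host-monitor"))
```

Leaving dry_run=True for the first pass shows what would be deleted; pass dry_run=False only after the role has been restarted and is writing to the new directory.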

 

Cheers

Steve