
Size of the /var/log disk is running out of control

New Contributor

I installed a 6-node cluster using Cloudera's CDH4 parcel. I am not running any jobs, yet the /var directory size is increasing by 2% daily! Is there any way to control the size of the log files (limit the history of data they can store)? Is there a Linux command that allows me to delete any old or stale log files?

 

Thank you 
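
For example, is something along these lines the right idea? (The name pattern and the 30-day cutoff are just a guess on my part, not something I have run.)

# List log files under /var/log older than 30 days
find /var/log -type f -mtime +30 -name "*.log*" -print
# and, once the list looks sane, delete them by re-running with -delete:
# find /var/log -type f -mtime +30 -name "*.log*" -delete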

1 ACCEPTED SOLUTION

Guru

@happynodes I have moved your thread to the Cloudera Manager board, because you mentioned that you were using Parcels, and as far as I know that is a CM-specific packaging model.

 

To answer your question, there are settings throughout CM to control the size and number of log files that are retained. Please be aware that each Hadoop service, as well as the Cloudera Manager management/monitoring services, keeps its own logs, and by default those end up in /var/log.

 

For example, browse to your hdfs1 service page and click "Configuration -> View and Edit". On the left-hand side, you will be able to expand several menus and see "Logs" sections, which allow you to configure the logging of that service. DataNode, Failover Controller, and NameNode are a few examples.

 

What I would recommend is to first identify which directory under /var/log is the actual culprit, and then go into CM and adjust the log retention settings for that service.
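
For that first step, something along these lines will usually show the biggest offenders (assuming the default /var/log layout; adjust the path if you have relocated logs):

# Show per-service log usage under /var/log, largest first
du -sh /var/log/* 2>/dev/null | sort -rh | head -20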


5 REPLIES


Contributor

Hi,

 

I'm having the same issue on CDH 5.4. I don't seem to see the same options compared to CDH 4. Can anyone help, as one of my nodes is at 98%!

 

Cheers

 

Contributor

This is what I've done:

 

I’ve changed the location of the logging directory for all the services to use a partition with very little usage.

The advantage of this is that you can simply do an "rm -rf" on this directory if and when Cloudera Manager starts complaining about disk space (after stopping the cluster). Once the cluster is restarted, Cloudera Manager even recreates the relevant subdirectories. I've tested this already.

 

To do this, for each service select "Configuration", then select "Display All" at the bottom of the page, search for "/var/log/", and replace it with your preferred directory, e.g. /var/log/zookeeper to <new_path>/zookeeper.

You'll need to restart the service or the cluster, depending on the service.
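
As a rough sketch of the filesystem side of this (I'm using /data1 as the example partition; substitute your own path):

# One-time: create the new log root on the roomier partition before
# pointing the log directory settings in CM at it
mkdir -p /data1/log

# Later, if the disk fills up again: stop the cluster, wipe the relocated
# logs, and restart - CM recreates the per-service subdirectories
rm -rf /data1/log/*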

Contributor

Hi,

 

I have faced a situation of this kind, where three days after installation Cloudera Manager stopped working and the root cause was that the /var filesystem was completely full.

 

What I did:

 

1) Manually deleted unnecessary files in the /var partition.

2) Logged in to the Cloudera Manager web UI after successfully removing the unwanted files.

3) Changed the log location of a few services to a new partition with plenty of free space.

 

After this, within a few days there was an issue where the logs were not being updated properly.

 

The root cause was the following:

 

We should be careful when moving the original log location to a new one.

The filesystem permissions matter: each new log directory should be owned by the respective service user. For example, /<new partition>/log/hive should be owned by the hive user, not by root.

 

In my case the new log location /<new partition>/log/hive had been created as root, and that was the issue.
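
For illustration, the fix on my side was essentially this (hive is just one example; repeat per service with the matching user and group, and I'm using /data1 as a stand-in for the new partition):

# Create the relocated log directory and hand ownership to the service user,
# not root, so the daemon can actually write its logs there
mkdir -p /data1/log/hive
chown -R hive:hive /data1/log/hive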

 

@steveandbee: Are the internal log folders created automatically after cluster restart?

New Contributor

Clint,

 

  Is there any guidance *against* using logrotate for a cluster, if it is installed?

 

Thanks,

Chris
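
To make the question concrete, the sort of thing I would put in place is roughly the following, written as a shell snippet; the path pattern and retention are purely illustrative and not something I have deployed against CDH.

# Hypothetical logrotate stanza for HDFS daemon logs: weekly rotation,
# keep 4 compressed copies, rotate in place so the daemon keeps its handle
cat > /etc/logrotate.d/hadoop-hdfs <<'EOF'
/var/log/hadoop-hdfs/*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
EOF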