Created on 11-28-2016 12:29 PM - edited 09-16-2022 03:49 AM
Hi,
Can I just delete everything (rm -rf *) in some of the log folders, such as /var/log/hive?
16G ./hive
Thanks,
Avijeet
Created 11-28-2016 04:22 PM
You can remove the log files, but I would recommend a much easier way: automate it.
Most services in Hadoop use log4j. Simply enable RollingFileAppender and set MaxBackupIndex to the maximum number of log files you want to retain for that service.
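As a rough illustration of that suggestion, a log4j 1.x properties fragment might look like the following. The appender name RFA, the size limit, the backup count, and the use of ${hive.log.dir}/${hive.log.file} are assumptions for the sake of the example; check the log4j properties file your distribution actually ships for the service before changing anything.

```properties
# Sketch only: appender name, sizes, and paths are illustrative assumptions.
# Route the root logger to the rolling appender defined below.
log4j.rootLogger=INFO, RFA

log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hive.log.dir}/${hive.log.file}
# Roll the file once it reaches this size...
log4j.appender.RFA.MaxFileSize=256MB
# ...and keep at most this many rolled files; older ones are deleted.
log4j.appender.RFA.MaxBackupIndex=10
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} - %m%n
```

With these example values the log would be capped at roughly 11 x 256MB for that service. Note that MaxBackupIndex belongs to RollingFileAppender (size-based rolling); log4j 1.x's DailyRollingFileAppender does not offer it.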
Created 11-28-2016 06:13 PM
@Avijeet Dash, the suggestion from Sunile is great. But where you can't do that, here is another solution.
If you need to manually delete all but the last X files matching a certain file pattern (*.zip, files*.log, etc.), you can run something like the command below, which selects all but the 5 most recently modified matching files.
# find MY_LOG_DIR -type f -name "FILE_PATTERN" -printf "%T+\t%p\n" | sort | cut -f2- | head -n -5 | xargs -I {} CMD_FOR_EACH_FILE {}
Replace the uppercase placeholders (MY_LOG_DIR, FILE_PATTERN, CMD_FOR_EACH_FILE) as needed. This relies on GNU find, head, and xargs.
For example, the following command finds all but the 5 most recent files matching the pattern *.log.20##-##-## and deletes them. Because it deletes files, test it first by replacing "rm" with "ls -l", or use "mv" instead. Test, test, test.
# find /var/log/hive -type f -name "*.log.20[0-9][0-9]-[0-2][0-9]-[0-9][0-9]" -printf "%T+\t%p\n" | sort | cut -f2- | head -n -5 | xargs -I {} rm {}
There are always many ways to solve a problem and I'm sure there is a more elegant solution.
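One way to make the "test first" step harder to skip is to separate listing from deleting. The following is a minimal sketch, assuming bash and GNU find/sort/head; the function name list_prunable is made up for illustration and is not part of any tool. It only prints the files that would be removed, oldest first.

```shell
#!/usr/bin/env bash
# list_prunable DIR PATTERN KEEP   (hypothetical helper, not a standard tool)
# Prints every regular file under DIR whose name matches PATTERN, except the
# KEEP most recently modified ones. Nothing is deleted by this function.
# Assumes GNU find/head and file names without embedded newlines.
list_prunable() {
  local dir=$1 pattern=$2 keep=$3
  # %T@ is the modification time in seconds since the epoch, so a numeric
  # sort puts the oldest file first; head -n -KEEP drops the newest KEEP lines.
  find "$dir" -type f -name "$pattern" -printf '%T@\t%p\n' \
    | LC_ALL=C sort -n \
    | head -n "-$keep" \
    | cut -f2-
}

# Dry run: review the list first.
#   list_prunable /var/log/hive '*.log.*' 5
# Delete only after the list looks right (GNU xargs: -r skips empty input,
# -d '\n' treats each line as one file name):
#   list_prunable /var/log/hive '*.log.*' 5 | xargs -r -d '\n' rm --
```

Because the destructive part is a separate, explicit pipe into rm, the same listing can be reviewed, logged, or fed to mv instead.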
Created 11-28-2016 06:26 PM
Note that the above deletes older files based on file modification time, not on the timestamp in the filename. My example used filenames containing a timestamp, which probably makes it confusing. The same command works with any kind of file, for instance to keep the last 5 copies of your backup files.
Also, if you use logrotate (e.g. where log4j rolling files are not an option), you can use the maxage option, which also uses modification time. This is from the logrotate man page:
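If you do want to prune by the date embedded in the filename rather than by modification time, something like the sketch below could be a starting point. It assumes bash, GNU find/sed/head, and names ending in .YYYY-MM-DD; the function name list_prunable_by_name is hypothetical. Like a dry run, it only lists files.

```shell
#!/usr/bin/env bash
# list_prunable_by_name DIR KEEP   (hypothetical helper, not a standard tool)
# Prints files under DIR whose names end in .YYYY-MM-DD, except the KEEP
# files with the latest dates in their names. Modification time is ignored.
list_prunable_by_name() {
  local dir=$1 keep=$2
  # sed copies the trailing date to the front of each line as a sort key;
  # ISO dates sort chronologically as plain text.
  find "$dir" -type f -name '*.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]' \
    | sed -E 's/^.*\.([0-9]{4}-[0-9]{2}-[0-9]{2})$/\1\t&/' \
    | LC_ALL=C sort \
    | head -n "-$keep" \
    | cut -f2-
}

# Example (lists only, deletes nothing):
#   list_prunable_by_name /var/log/hive 5
```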
maxage count
    Remove rotated logs older than <count> days. The age is only checked if the logfile is to be rotated. The files are mailed to the configured address if maillast and mail are configured.
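For reference, a logrotate stanza using maxage might look roughly like this. The path glob and all the numbers are illustrative assumptions, not values taken from a real Hive installation; adjust them to your own retention needs.

```
# Sketch only: path and values are assumptions for illustration.
/var/log/hive/*.log {
    daily
    # keep at most 5 rotated files
    rotate 5
    # also drop rotated files older than 7 days (checked at rotation time)
    maxage 7
    missingok
    notifempty
    compress
    # copy then truncate, so the running service keeps its open file handle
    copytruncate
}
```

A stanza like this would typically live in its own file under /etc/logrotate.d/, and can be checked without changing anything by running logrotate in debug mode (logrotate -d) against it.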