Created 04-11-2017 05:30 AM
Hi All,
As far as I know, major compaction runs by default every 7 days. However, I want to schedule it outside business hours, and the best solution for that seems to be a cron job.
To schedule major compaction with cron, do we have to schedule the compaction for every table manually with some delay between them, or is there another method that can run the job at a given time on all the tables in HBase?
Created 04-11-2017 03:21 PM
You can also use a standard cron implementation on Linux, e.g.:
echo "major_compact 'FOO'" | hbase shell -n
You could schedule the above to run on a specific node at your off-peak time. Be sure to monitor the output so that you can react to any possible failures.
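To cover several tables in one run, the one-liner above can be wrapped in a small script. This is only a minimal sketch: the function names and the `HBASE_CMD` override are made up for illustration, and the table names are whatever you pass on the command line.

```shell
#!/usr/bin/env bash
# Sketch: build one major_compact command per table and feed them all to a
# single non-interactive hbase shell session.

# Print one "major_compact '<table>'" line per argument.
build_compact_commands() {
  local table
  for table in "$@"; do
    printf "major_compact '%s'\n" "$table"
  done
}

# Pipe the generated commands into the shell. HBASE_CMD is a hypothetical
# override (not an HBase setting) so the script can be dry-run, e.g. with cat.
run_major_compactions() {
  build_compact_commands "$@" | ${HBASE_CMD:-hbase} shell -n
}

# Only do anything when table names were given on the command line.
if [ "$#" -gt 0 ]; then
  run_major_compactions "$@"
fi
```

Saved under a name of your choosing, it would be called from cron with the table names as arguments, for example `FOO BAR`. Note that `major_compact` only queues the request, so the script returning does not mean the compaction has finished.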
Created 04-11-2017 08:15 AM
You can write a job and schedule it from Oozie or Azkaban.
Created 04-14-2017 02:57 PM
Hi Josh, thank you for the input. I came across one more method, where compaction can be run in off-peak hours using hbase.offpeak.start.hour. However, my understanding of this parameter is that it will run major compaction every day.
So is there any way I can use the hbase.offpeak.start.hour parameter and schedule major compaction for all the tables once a week?
Created 11-14-2022 01:15 PM
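Yes, what elserj said is correct, but inside your crontab entry please also source your shell profile first, because cron runs jobs with a minimal environment and may not find hbase on the PATH otherwise:
. $HOME/.bashrc;
For example, to run every Monday at 15:09: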
09 15 * * 1 . $HOME/.bashrc; PATH:/compact.sh > /home/user/logfile.log 2>&1
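The thread never shows what compact.sh contains, so here is one rough guess at what such a wrapper could look like. The `log` and `compact_table` helpers and the `HBASE_CMD` override are hypothetical names, and the table name is taken from the first argument. It relies on `hbase shell -n` exiting with a non-zero status when a command fails.

```shell
#!/usr/bin/env bash
# Hypothetical compact.sh: request a major compaction for one table and write
# timestamped lines that the cron log redirect can capture.

# Prefix a message with the current date and time.
log() {
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
}

# Submit major_compact for the given table; return 1 if the shell reports failure.
# HBASE_CMD is an assumed override for testing, not an HBase setting.
compact_table() {
  local table="$1"
  log "requesting major_compact on $table"
  if echo "major_compact '$table'" | ${HBASE_CMD:-hbase} shell -n; then
    log "major_compact request for $table submitted"
  else
    log "major_compact request for $table FAILED"
    return 1
  fi
}

# Run only when a table name was passed, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
  compact_table "$1"
fi
```

With the redirect from the crontab line above, both the timestamps and any hbase shell errors end up in the log file, which makes failed runs easier to spot.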