Schedule Major Compaction Using Cron Job
Labels: Apache HBase
Created ‎04-11-2017 05:30 AM
Hi all,
As far as I know, major compaction runs by default every 7 days. However, I want to schedule it during off-business hours, and a cron job seems like the best way to do that.
To schedule major compaction with a cron job, do we have to schedule the compaction on each table individually with some delay between them, or is there another method that can run it at a given time on all the tables in HBase?
Created ‎04-11-2017 03:21 PM
You can also use the standard Linux cron, e.g.
echo "major_compact 'FOO'" | hbase shell -n
You could schedule the above to run on a specific node at your off-peak time. Be sure to monitor the output so that you can react to any failures.
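To cover several tables from one cron entry, the one-liner above can be wrapped in a small script. The following is only a sketch: the script name compact_tables.sh and the function build_compact_commands are hypothetical, and it assumes the hbase command is on the PATH of the user running cron. It generates one major_compact command per table name given on the command line and pipes them all into a single non-interactive shell session.

```shell
#!/usr/bin/env bash
# Hypothetical helper (compact_tables.sh): request a major compaction
# for every table named on the command line.

# Print one HBase shell major_compact command per table name.
build_compact_commands() {
  local table
  for table in "$@"; do
    printf "major_compact '%s'\n" "$table"
  done
}

# Only call HBase when at least one table name was supplied.
# Assumes `hbase` is on PATH; -n runs the shell non-interactively.
if [ "$#" -gt 0 ]; then
  build_compact_commands "$@" | hbase shell -n
fi
```

A weekly crontab entry could then call it with the table names, redirecting output to a log file of your choice so failures can be spotted. Note that major_compact only requests the compaction and returns; the compaction itself runs in the background on the region servers, so the log shows whether the request was accepted, not when the compaction finished.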
Created ‎04-11-2017 08:15 AM
You can write a job and schedule it from Oozie or Azkaban.
Created ‎04-14-2017 02:57 PM
Hi Josh, thank you for the input. I came across one more method for running compactions during off-peak hours, using hbase.offpeak.start.hour. However, my understanding of this parameter is that it would run a major compaction every day.
So is there any way I can use the hbase.offpeak.start.hour parameter and still schedule major compaction for all the tables once a week?
Created ‎11-14-2022 01:15 PM
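Yes, what elserj said is correct, but cron runs jobs with a minimal environment, so inside your crontab job please add
. $HOME/.bashrc;
so that the hbase command and its environment variables are available. For example:
09 15 * * 1 . $HOME/.bashrc; PATH:/compact.sh > /home/user/logfile.log 2>&1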
