Any blocking during HBase compaction?

New Contributor

Probably a real simple question, but I can't seem to find the answer.

What, if anything, is the impact on the availability of a region during HBase maintenance tasks like major/minor compactions, region splits/merges, etc.?

For example, can we read from and write to a region while it is doing a compaction, or will those requests be blocked until the operation has completed?

Thank you

Accepted Solutions

Re: Any blocking during HBase compaction?

I am pretty sure it is not correct that HBase is blocked during major compactions. I tried to find a definitive statement and could not, but I am very sure that you can still read from and write to a region during a major compaction. There will, however, be a heavy impact on IO and CPU on the region servers as the store files are rewritten, so major compactions are normally scheduled at night. If I am wrong on this one, please clarify.

Region splits are essentially immediate, since a region is only split logically into two; the data itself is rewritten during the next compactions. So there may be some impact, but it should be very quick.
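To make the "logical split" idea concrete, here is a minimal toy sketch in Python. It is not HBase code: the `Reference` class, the `split` function, and the use of a plain dict as a "store file" are all hypothetical stand-ins. It only illustrates the point above, that a split can hand each daughter a pointer into the parent's data plus a key range, and leave the actual rewrite to a later compaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical stand-in for a daughter region's pointer into a parent store file."""
    parent_file: dict  # shared with the parent; never copied at split time
    start: str         # inclusive lower bound of the daughter's key range
    end: str           # exclusive upper bound; "" means unbounded

    def _in_range(self, key):
        return key >= self.start and (self.end == "" or key < self.end)

    def get(self, key):
        # Reads go straight to the parent's data, filtered by key range.
        return self.parent_file.get(key) if self._in_range(key) else None

    def rewrite(self):
        # What a later compaction would do: copy only this daughter's rows.
        return {k: v for k, v in self.parent_file.items() if self._in_range(k)}


def split(parent_file, split_key):
    """Split logically: create two references, rewrite no data."""
    return (Reference(parent_file, "", split_key),
            Reference(parent_file, split_key, ""))
```

In this toy model, `split(parent, "m")` returns immediately no matter how large `parent` is, which is why the split itself is cheap; the cost is deferred to `rewrite()`.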

Region merging is an interesting question; I am not aware of how that process works.

Replies

Re: Any blocking during HBase compaction?

Rising Star

If an HBase table is undergoing a major compaction, clients may see very low read/write throughput. Eventually, clients may hit connection timeouts until the major compaction is over.

During a minor compaction, the table remains available for reads and writes.

For more details, refer to this link

Re: Any blocking during HBase compaction?

Rising Star

I haven't looked at the actual source code for major compaction. In practice, though, I have not seen any HBase client able to complete a transaction during a major compaction.

Re: Any blocking during HBase compaction?

That is very weird. After all, a minor compaction sometimes gets elevated to a major compaction. It would be pretty catastrophic if that made HBase inaccessible, and it is never mentioned anywhere. I totally agree that there will be a performance impact, of course.

http://www.ngdata.com/visualizing-hbase-flushes-and-compactions/

  • during flushes & compactions, HBase keeps processing put and get requests, always giving a consistent view of the data

Re: Any blocking during HBase compaction?

Rising Star

You might be right. In my previous experience with HBase (with high write throughput requirements), the clients timed out every time and were not able to re-establish a connection until the major compaction was over. To be precise, the connection was not blocked or lost as soon as the major compaction started; it gradually died, and the clients were not able to reconnect until the major compaction was over. It might be a side effect.

Re: Any blocking during HBase compaction?

New Contributor

Thank you both for your replies. I too have been unable to find a definitive statement about availability of table/region during major compaction. I understand that there will be impact on IO/CPU and plan on scheduling major compactions on weekends (or other periods of lower activity), but for a 24/7 application, I need to understand if the application will be unavailable/blocked during the minutes(?) of compaction.
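As a rough illustration of the "schedule it for the weekend" plan, here is a small Python sketch of a window check that a scheduled job could run before asking HBase for a major compaction. The function name, the default days, and the default hours are all assumptions made for this example. The trigger itself is deliberately left out; one option is the `major_compact` command in the HBase shell.

```python
from datetime import datetime, time


def in_compaction_window(now, days=(5, 6), start=time(1, 0), end=time(5, 0)):
    """Return True if `now` is on an allowed weekday and within [start, end).

    Weekdays follow datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    The defaults (weekend, 01:00 to 05:00) are illustrative assumptions only.
    """
    return now.weekday() in days and start <= now.time() < end


if __name__ == "__main__":
    if in_compaction_window(datetime.now()):
        print("inside the low-activity window: OK to trigger a major compaction")
    else:
        print("outside the window: skip")
```

A check like this only makes sense if HBase is not also starting major compactions on its own timer; the periodic interval is controlled by `hbase.hregion.majorcompaction`, and setting it to 0 turns the time-based trigger off.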

Re: Any blocking during HBase compaction?

I am sure that there is no outage during a major compaction. Compactions write new store files while the old files still exist, and then the files are switched out. I don't think that basic process changes between minor and major compaction. The difference is that a major compaction takes all store files and removes deleted rows as well, so it has more impact on the cluster. Sometimes, when all files are selected for a minor compaction, HBase will do a major one anyhow. So no, unless an HBase committer jumps in and tells me otherwise, there is no outage during a major compaction.
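The "write new files, then switch" process can be sketched as a toy model in Python. This is not how HBase is implemented: the `Store` class, the `TOMBSTONE` marker, and the policy of compacting the newest `count` files are all hypothetical. It only shows why reads do not need to block: readers work from a snapshot of the file list, and the compaction replaces that list in a single swap once the merged file is complete.

```python
import threading

TOMBSTONE = object()  # hypothetical delete marker


class Store:
    """Toy model of one store: a list of immutable store files (dicts), newest last."""

    def __init__(self):
        self._files = []
        self._lock = threading.Lock()  # guards only the file-list swap

    def flush(self, cells):
        """Add a new immutable store file, as a memstore flush would."""
        with self._lock:
            self._files = self._files + [dict(cells)]

    def file_count(self):
        return len(self._files)

    def get(self, key):
        """Read from a snapshot of the file list, newest file first."""
        files = self._files  # the list object is never mutated in place
        for f in reversed(files):
            if key in f:
                return None if f[key] is TOMBSTONE else f[key]
        return None

    def compact(self, count=None):
        """Merge the newest `count` files; returns True if it ran as a major compaction.

        Assumes only one compaction runs at a time. If the selection covers
        every file, the compaction is treated as major and tombstones are dropped.
        """
        snapshot = self._files
        if not snapshot:
            return False
        if count is None or count >= len(snapshot):
            count = len(snapshot)
        major = count == len(snapshot)
        start = len(snapshot) - count
        merged = {}
        for f in snapshot[start:]:  # oldest to newest, so newer cells win
            merged.update(f)
        if major:
            merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
        # Old files stayed readable during the merge; now swap them out.
        with self._lock:
            current = self._files
            self._files = current[:start] + [merged] + current[start + count:]
        return major
```

Note that the minor compaction keeps tombstones, because an older file outside the selection may still hold the deleted row; only the major compaction, which sees every file, can safely drop them.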

http://www.slideshare.net/cloudera/hbasecon-2013-compaction-improvements-in-apache-hbase
