Created 05-23-2016 03:11 PM
HBASE-10201 added logic to flush based on the size of each column family (CF). If the size of a CF > DEFAULT_HREGION_MEMSTORE_PER_COLUMN_FAMILY_FLUSH, then only those CFs are flushed and not the entire memstore. How do I enable this? I don't see much documentation on this feature.
Created 05-23-2016 03:19 PM
It is enabled by default (FlushLargeStoresPolicy). You just need to configure the property below. Column families whose memstore size is less than the value of this property will not be flushed.
hbase.hregion.percolumnfamilyflush.size.lower.bound
Created 05-23-2016 03:55 PM
There is also hbase.hregion.percolumnfamilyflush.size.lower.bound.min:
If FlushLargeStoresPolicy is used and there are multiple column families, then every time the total memstore limit is hit, we find all the column families whose memstores exceed a "lower bound" and flush only those, retaining the others in memory. The "lower bound" is "hbase.hregion.memstore.flush.size / column_family_number" by default, unless the value of this property is larger than that. If none of the families has a memstore larger than the lower bound, all the memstores are flushed (just as usual).
Default: 16777216 (16 MB)
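To make the selection rule described above concrete, here is a minimal sketch in plain Java. It is not the actual FlushLargeStoresPolicy code; the class name, method names, and the sample memstore sizes are hypothetical, and the 128 MB value for hbase.hregion.memstore.flush.size is an assumption used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: mirrors the rule quoted above, not HBase internals.
public class FlushSelectionSketch {

    // Effective lower bound: flush size divided by the number of column
    // families, unless the configured minimum is larger than that.
    static long lowerBound(long memstoreFlushSize, int familyCount, long lowerBoundMin) {
        return Math.max(memstoreFlushSize / familyCount, lowerBoundMin);
    }

    // Returns the families whose memstore exceeds the lower bound.
    // If none exceeds it, every family is returned (a full flush, as usual).
    static List<String> selectFamiliesToFlush(Map<String, Long> memstoreSizes,
                                              long memstoreFlushSize,
                                              long lowerBoundMin) {
        long bound = lowerBound(memstoreFlushSize, memstoreSizes.size(), lowerBoundMin);
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreSizes.entrySet()) {
            if (entry.getValue() > bound) {
                selected.add(entry.getKey());
            }
        }
        if (selected.isEmpty()) {
            selected.addAll(memstoreSizes.keySet());
        }
        return selected;
    }

    public static void main(String[] args) {
        long flushSize = 128L * 1024 * 1024; // assumed hbase.hregion.memstore.flush.size
        long lowerBoundMin = 16777216L;      // default quoted in this thread

        // Hypothetical per-family memstore sizes in bytes.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("cf1", 100L * 1024 * 1024);
        sizes.put("cf2", 20L * 1024 * 1024);
        sizes.put("cf3", 8L * 1024 * 1024);

        System.out.println("lower bound = " + lowerBound(flushSize, sizes.size(), lowerBoundMin));
        System.out.println("flush       = " + selectFamiliesToFlush(sizes, flushSize, lowerBoundMin));
    }
}
```

With three families the computed bound (flush size / 3) is larger than the 16 MB minimum, so only cf1 is selected. With ten families the division would drop below 16 MB and the configured minimum would apply instead.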
Created 05-24-2016 02:37 AM
Note: hbase.hregion.percolumnfamilyflush.size.lower.bound is used in HDP 2.3 / 2.4
hbase.hregion.percolumnfamilyflush.size.lower.bound.min would be used in HDP 2.5