HBase region split

Is it possible for the load balancer to create new regions?

I have a table with 100 regions, and every region is smaller than 3.5 GB.

The max region size is set to 10 GB (the default).

The daily load goes through the bulk loader and adds approximately 250 MB to each region.

However, I see new regions being added to this table every day, and I don't know why.

As I understand it, new regions should be created only when a region grows beyond 10 GB and a region split happens.

Any help or thoughts are appreciated.

1 ACCEPTED SOLUTION

Yes, with IncreasingToUpperBoundRegionSplitPolicy a region can split while it is still far below the maximum size; this is expected behavior. The reason is that HBase tries to create many regions while they are small and distribute them across the cluster. If you do not want this, you will need to switch to ConstantSizeRegionSplitPolicy.

hbase.regionserver.region.split.policy sets the default split policy for the cluster; the policy can also be overridden for an individual table through its table descriptor.
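As a rough sketch of what the switch could look like, the Python snippet below builds the property entry that would go into hbase-site.xml. The property name is the one given above; the fully qualified class name is an assumption based on HBase's usual org.apache.hadoop.hbase.regionserver package, and the helper name split_policy_property is made up for illustration. Please verify both against the documentation for your HBase version before using the output.

```python
import xml.etree.ElementTree as ET

# Property name as given in the answer above.
POLICY_KEY = "hbase.regionserver.region.split.policy"

# Assumption: the policy class lives in HBase's regionserver package.
CONSTANT_SIZE = "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy"


def split_policy_property(policy_class=CONSTANT_SIZE):
    """Return one hbase-site.xml property element, serialized as a string.

    Hypothetical helper: it only renders XML text and does not touch a cluster.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = POLICY_KEY
    ET.SubElement(prop, "value").text = policy_class
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Paste the printed element inside the configuration element of hbase-site.xml.
    print(split_policy_property())
```

A change made in hbase-site.xml typically only takes effect after the region servers are restarted.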

3 REPLIES

Can you provide more information, for example by attaching the region server log?

The load balancer does not create new regions.

@sunny malik

Which HBase version are you using? The default split policy for HBase 0.94 and above is not based on a fixed size; it is IncreasingToUpperBoundRegionSplitPolicy. Assuming this is your split policy, and given that your regions are smaller than 3.5GB, what you are seeing is expected behavior.

The default split policy for HBase 0.94 and trunk is IncreasingToUpperBoundRegionSplitPolicy, which splits more aggressively, based on the number of regions hosted on the same region server. The policy splits a region once its store file size reaches min(R^2 * hbase.hregion.memstore.flush.size, hbase.hregion.max.filesize), where R is the number of regions of the same table hosted on the same region server. For example, with the default memstore flush size of 128MB and the default maximum store file size of 10GB, the first region on the region server is split just after the first flush, at 128MB. As the number of regions hosted on the region server increases, the policy uses increasing split sizes: 512MB, 1152MB, 2GB, 3.2GB, 4.6GB, 6.2GB, etc. Once the region server hosts 9 regions, the computed split size exceeds the configured hbase.hregion.max.filesize, and from then on the 10GB split size is used.
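To make the arithmetic above concrete, here is a minimal Python sketch of the formula as described in this reply. The function names are made up for illustration, and the defaults (128MB flush size, 10GB max file size) are the values quoted above; this is not HBase's actual implementation, and other HBase versions may compute the threshold differently.

```python
MB = 1024 * 1024
GB = 1024 * MB


def split_threshold(regions_on_server, flush_size=128 * MB, max_file_size=10 * GB):
    """Split size in bytes: min(R^2 * flush size, max file size).

    regions_on_server is R, the number of regions of the same table
    hosted on the same region server.
    """
    return min(regions_on_server ** 2 * flush_size, max_file_size)


def threshold_table(max_regions=10):
    """Return (R, threshold in MB) pairs for R = 1..max_regions."""
    return [(r, split_threshold(r) // MB) for r in range(1, max_regions + 1)]


if __name__ == "__main__":
    for r, mb in threshold_table():
        print(f"R={r:2d}  split at {mb} MB")
```

As a purely hypothetical illustration (the thread does not say how many region servers the cluster has): if a region server hosted 5 regions of the table, the formula would give a 3200MB threshold, which is below the 3.5GB region size mentioned in the question.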

Please see the following link.

http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
