HBase region split

Explorer

Is it possible for the load balancer to create new regions?

I have a table with 100 regions, and every region is smaller than 3.5 GB.

The max region size is set to 10 GB (the default).

The daily load goes through the bulk loader and adds approximately 250 MB to each region.

However, I see new regions being added to this table every day, and I don't know why.

I expected new regions to be created only when a region grows beyond 10 GB and a region split happens.

Any help or thoughts are appreciated.

Accepted Solution

Re: Hbase region split

Contributor

Yes, with IncreasingToUpperBoundRegionSplitPolicy a region can split while it is still far below the maximum size; this is expected behavior. The reason is that HBase tries to create many regions while they are small and distribute them across the cluster. If you do not want this, switch to ConstantSizeRegionSplitPolicy.

hbase.regionserver.region.split.policy sets the default split policy; the policy can also be overridden per HBase table through the SPLIT_POLICY table attribute.

Re: Hbase region split

Super Collaborator

Can you provide more information, for example by attaching the region server log?

The load balancer does not create new regions.

Re: Hbase region split

Super Guru

@sunny malik

Which HBase version are you using? The default split policy for HBase 0.94 and above is not based on a fixed size. It is IncreasingToUpperBoundRegionSplitPolicy. Assuming this is your split policy, and given that your regions are smaller than 3.5 GB, what you are seeing is expected behavior.

The default split policy for HBase 0.94 and trunk is IncreasingToUpperBoundRegionSplitPolicy, which splits more aggressively based on the number of regions hosted on the same region server. The policy computes the split size as min(R^2 * hbase.hregion.memstore.flush.size, hbase.hregion.max.filesize), where R is the number of regions of the same table hosted on the same region server. For example, with the default memstore flush size of 128 MB and the default max store size of 10 GB, the first region on the region server is split just after the first flush at 128 MB. As the number of regions hosted on the region server increases, the policy uses increasing split sizes: 512 MB, 1152 MB, 2 GB, 3.2 GB, 4.6 GB, 6.2 GB, and so on. After reaching 9 regions, the computed split size exceeds the configured hbase.hregion.max.filesize, and the 10 GB split size is used from then on.
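To make the progression above easier to check, here is a minimal Python sketch of the formula as quoted in this reply. It is not HBase code: the function name `split_threshold_bytes` and its parameters are made up for illustration, the defaults simply mirror the 128 MB flush size and 10 GB max file size mentioned above, and other HBase versions may compute the threshold differently.

```python
MB = 1024 * 1024
GB = 1024 * MB


def split_threshold_bytes(regions_on_server, flush_size=128 * MB, max_file_size=10 * GB):
    """Split size per the formula quoted above: min(R^2 * flush size, max file size).

    regions_on_server is R, the number of regions of the same table hosted
    on one region server. The name and defaults are illustrative assumptions.
    """
    if regions_on_server < 1:
        raise ValueError("regions_on_server must be at least 1")
    return min(regions_on_server ** 2 * flush_size, max_file_size)


if __name__ == "__main__":
    # Print the threshold for 1 to 10 regions of the table on one server.
    for r in range(1, 11):
        print(f"R={r:2d}  split at {split_threshold_bytes(r) // MB} MB")
```

Under these assumptions, a region server that hosts 5 regions of the table gets a threshold of 3200 MB, so a region can become eligible for a split while it is still well below the 10 GB maximum.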

Please see the following link.

http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
