HBase Master Heap Size Recommendation

Contributor

Hi all,

Is there a rule of thumb for sizing the HBase Master heap? In our current environment the HBase Master heap size is 8 GB and the average load is about 6200+ regions per region server.

Do we need to change the Master heap size, given that the number of regions per region server is so high?

Thanks in advance.

1 ACCEPTED SOLUTION

Super Guru

The HBase Master (HMaster) has a small set of responsibilities that do not require a lot of memory. HMaster handles administrative tasks such as assigning regions, updating the meta table, and executing DDL statements. Clients do not interact with HMaster when they read, scan, or write data.

You can easily reduce your HMaster heap size to 4 GB.

That said, 6200 regions per region server is too high. Is this uniform across the cluster, or is it the result of hot spotting on some regions, which would indicate poor key design?

I used to recommend no more than 200 regions per region server. I am aware that with recent improvements this can be raised to maybe 500 regions on the high side, but 6200 regions per region server is unheard of. If you are seeing performance issues and running into failures, you need to fix this first. If your regions are not balanced, check whether the HBase region balancer is enabled (it is enabled by default).

Contributor

Hi,

Thanks for your reply!

The regions are spread uniformly across the region servers. I can also see in the Ambari UI that HBase Master heap utilization is at 85-90% most of the time, even though the HMaster heap size is 8 GB. In our cluster we rely mostly on Hive and HBase.

Super Guru

But you still have 6200 regions per region server, is that right? That by itself can cause a lot of master activity. You need to change your settings so that you have fewer regions per region server.

What is the heap size of your region server? What is the memstore flush size for your tables (hbase.hregion.memstore.flush.size)?

Contributor

Hi,

Yes, that's right, we have ~6200+ regions per region server. How do we change the settings to get fewer regions per region server? Can you also tell me how to enable the HBase load balancer so that regions are spread evenly across all region servers?

The HBase region server heap size is 16 GB and the memstore flush size is 128 MB.

Thanks!!

Super Guru

Your region server heap size is just about right, and the memstore flush size seems reasonable too. How many column families do you have in your tables?

Use the following to determine your number of regions:

Usage of the region server's memstore largely determines the maximum number of regions for the region server. Each region has its own memstores, one for each column family, which grow to a configurable size, usually between 128 and 256 MB. Administrators specify this size with the hbase.hregion.memstore.flush.size property in the hbase-site.xml configuration file. The region server dedicates some fraction of total memory to region memstores based on the value of the hbase.regionserver.global.memstore.size configuration property. If usage exceeds this configurable size, HBase may become unresponsive or compaction storms might occur.

Use the following formula to estimate the number of regions for a region server:

(regionserver_memory_size) * (memstore_fraction) / ((memstore_size) * (num_column_families))

For example, assume the following configuration:

  • Region server with 16 GB RAM (or 16384 MB)
  • Memstore fraction of 0.4
  • Memstore size of 128 MB
  • 1 column family in the table

The formula for this configuration would look as follows:

(16384 MB * 0.4) / (128 MB * 1) = approximately 51 regions

The easiest way to decrease the number of regions for this example region server is to increase the memstore size to 256 MB. The reconfigured region server would then have approximately 25 regions, and the HBase cluster will run more smoothly if the reconfiguration is applied to all region servers in the cluster. The formula can be used for multiple tables with the same configuration by using the total number of column families in all the tables.
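
The formula above is easy to turn into a small calculator. This is only a sketch: the function name estimate_max_regions and its parameter names are made up for illustration and are not part of HBase. The inputs are the example values from this post.

```python
def estimate_max_regions(heap_mb, memstore_fraction, flush_size_mb, num_column_families):
    """Estimate the number of regions one region server can comfortably hold.

    Implements: heap * memstore_fraction / (flush_size * column_families).
    All names here are illustrative, not HBase APIs.
    """
    if flush_size_mb <= 0 or num_column_families <= 0:
        raise ValueError("flush size and column family count must be positive")
    return (heap_mb * memstore_fraction) / (flush_size_mb * num_column_families)


# Example configuration from the post: 16 GB heap, 0.4 memstore fraction,
# 128 MB flush size, 1 column family.
regions_128 = estimate_max_regions(16384, 0.4, 128, 1)

# Same server with the flush size doubled to 256 MB.
regions_256 = estimate_max_regions(16384, 0.4, 256, 1)

print(f"128 MB flush size: about {int(regions_128)} regions")
print(f"256 MB flush size: about {int(regions_256)} regions")
```

Plugging in your own heap size, memstore fraction, and total column family count gives a rough target for regions per region server.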

Contributor

Hi,

Most of our tables have only 1 column family. hbase.regionserver.global.memstore.size is set to 0.25 and the flush size is 128 MB.

Yesterday we also ran "balance_switch true" from the HBase shell. Does that do any load balancing of regions?

Thanks.

Super Guru

A couple of questions. Assuming the load balancer was not enabled before, did you have 6200 regions per region server across the whole cluster, or are a couple of nodes the offenders while the rest are okay?

What is the size of your block cache (hfile.block.cache.size)? The sum of the block cache and memstore fractions should not be more than 0.8.
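
As a quick sanity check of that rule, here is a minimal sketch. The function name heap_fractions_ok is hypothetical, and the block cache values passed in below are illustrative inputs, not recommendations.

```python
def heap_fractions_ok(memstore_fraction, block_cache_fraction, limit=0.8):
    """Return True if memstore + block cache fractions stay within the limit.

    memstore_fraction corresponds to hbase.regionserver.global.memstore.size,
    block_cache_fraction to hfile.block.cache.size. A tiny tolerance avoids
    false negatives from floating-point rounding.
    """
    return memstore_fraction + block_cache_fraction <= limit + 1e-9


# Memstore fraction of 0.25 (the value reported in this thread) with a few
# illustrative block cache fractions.
for block_cache in (0.25, 0.4, 0.6):
    ok = heap_fractions_ok(0.25, block_cache)
    print(f"memstore=0.25 block_cache={block_cache}: {'ok' if ok else 'too high'}")
```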

A memstore fraction of 0.25 is on the low end, which usually means the block cache is tuned to a higher value. If it is not, the 0.25 memstore fraction may need to be increased. With a global memstore fraction of 0.25, your total memstore usage should not go beyond 16 GB x 0.25 = 4 GB. But 6200 regions x 128 MB each = 793,600 MB, or roughly 775 GB, if every memstore filled up. That is pretty messed up. You need to bring the region count down to a manageable value before you look into anything else. 8 GB for the Master is too much, but for now focus on reducing the region count.
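
To make the mismatch concrete, the arithmetic can be sketched as below. The helper names are made up for illustration; the numbers are the ones from this thread, and the worst case assumes every region fills one memstore to the flush size at the same time.

```python
def memstore_budget_mb(heap_mb, memstore_fraction):
    """Heap the region server reserves for all memstores combined."""
    return heap_mb * memstore_fraction


def worst_case_memstore_mb(num_regions, flush_size_mb, num_column_families=1):
    """Memstore space needed if every region filled up to the flush size."""
    return num_regions * flush_size_mb * num_column_families


budget = memstore_budget_mb(16 * 1024, 0.25)   # 16 GB heap, 0.25 fraction
demand = worst_case_memstore_mb(6200, 128)     # 6200 regions, 128 MB each

print(f"budget: {budget:.0f} MB ({budget / 1024:.0f} GB)")
print(f"worst-case demand: {demand} MB ({demand / 1024:.0f} GB at 1024 MB per GB)")
print(f"demand is {demand / budget:.2f}x the budget")
```

In practice not every region is written to at once, but a ratio this large means memstores get flushed long before they reach the configured flush size.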

Contributor

Hi,

Yes, the load balancer was not enabled before. I checked the HBase Master UI yesterday and the regions are not evenly spread across the region servers; some region servers have very few regions.

hfile.block.cache.size = 0.25 (25%).

We did some cleanup yesterday, so the HBase average load came down to 2181.07.

Please suggest what needs to be done.

Thanks.

Super Guru

Once you have run the balancer and you see regions moving to other nodes until the cluster is reasonably balanced, we can decide how to move forward. One option is to increase the global memstore fraction to 0.4 and the block cache to 0.4, but hold off on that until your cluster is balanced. To run the balancer from the shell, use the "balancer" command. To enable the balancer, run "balance_switch true"; it prints the previous balancer state, so if it prints false the balancer was off before. See the details at the following link, under "hbase surgery".

https://learnhbase.wordpress.com/2013/03/02/hbase-shell-commands/
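
If you want to script those two shell commands instead of typing them, something like the following could work. It is only a sketch: it assumes the hbase launcher is on your PATH and that your HBase version supports the shell's non-interactive -n option, and the helper names are made up. Nothing is executed unless you call run_hbase_shell yourself.

```python
import subprocess


def build_balancer_script(enable=True):
    """Build the HBase shell commands: optionally enable, then run the balancer."""
    lines = []
    if enable:
        lines.append("balance_switch true")  # prints the previous balancer state
    lines.append("balancer")                 # triggers a balancing run
    return "\n".join(lines) + "\n"


def run_hbase_shell(script, hbase_cmd="hbase"):
    """Pipe the script into 'hbase shell -n' (assumed available) and return the result."""
    return subprocess.run(
        [hbase_cmd, "shell", "-n"],
        input=script,
        capture_output=True,
        text=True,
        check=False,
    )


# Show what would be sent to the shell without running anything.
print(build_balancer_script(), end="")
```

Calling run_hbase_shell(build_balancer_script()) on a node with the HBase client installed would send both commands; check the returned stdout and return code rather than assuming success.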