Hbase works smoothly in auto pilot mode if one knows how to tune several of the knobs on its dashboard. Not only its important to understand each knob but also what its dependencies are with other knobs. There are several parameters which require tuning based on your use case or work load to make Hbase work in an optimized way. I will try to explain some of the basic parameters in this article. More advanced parameters would be covered in next article.
To many’s surprise , a master in Hbase does not do any heavy lifting and hence never require more than 4 - 8 GB in regular setups. Master is basically responsible for meta operations such as create/ delete of tables , keeping check on region servers’ well being using watchers on zookeeper znodes , re-distribution of regions during startup (balancer) or when a region server shuts down. Please note that master's assignment manager keeps track of region states in this memory only and hence if you have huge number of tables / regions, you ought to have proportional amount of heap for master.
This is a very crucial parameter for region server as most of the data loading / processing would happen in allocated region server heap. This is the heap memory which would accommodate block cache to make your reads faster, and it is this heap that would have region memstores to hold all the writes coming from your users (until they get flushed to the disk). But what is the best value for this heap size? How do we calculate this number?
Well there is no direct formula for this , but if you are using CMS GC algorithm for JVMs, your hard stop for heap is about 36 - 38 GB , otherwise long "stop the world" GC pauses would turn Hbase not only unusable but bring lot of complications w.r.t. the data being stored. Use your best judgement based on number of regions hosted currently, your future projections, any number between 16 GB - 36 GB is a good number, also, you should have a proper plan to tune this parameter incrementally over time based on cluster usage and number of regions added to nodes. With G1GC algorithm there is no restriction on heap size.
One can always check heap usage from Ambari > Hbase> Master UI > Memory tab, if utilization shoots during peak hours to about 60 - 70 % of total heap , its time to increase the heap size further. (unless its a case of memory leak).
This parameter sets upper bound on region server heap’s young generation size. Rule of thumb is to keep 1/8th - 1/10th of total heap and never exceeding 4000 Mb.
4. Number of regions on each region server
Discussing this aspect of tuning here as it would help you figure out the best heap size for region servers, memstore size as well as make you explain how these parameters are all dependent on number of regions and how performance of hbase is dependent on all these numbers. We never recommend more than 200 - 400 regions per region server. One can figure out if the existing count in his cluster is an optimized number or not using below formula :
The formula for this configuration would look as follows:
(16384 Mb * .4) / ((128 Mb * 1) = approximately 51 regions
This is the portion of total heap which would be used by block cache to make your reads even faster. The data once accessed from disk gets loaded in this cache and the next time any user requests same data, its served from here which is way faster than being served from disk. Caveat here is, keep this number big only if :
a. You have a heavy read use case.
b. Even in read heavy use case , you have your users requesting same data repetitively.
If both conditions do not match , you will be wasting a whole lot of heap loading unnecessary data blocks. In matching conditions, any value between 20 - 40 % is a good value. Again,need to be tuned using trial and error method and what works best for you.
Portion of total heap used for all the memstores opened for each column family per region per table. This is where all the edits and mutations get landed first during write operation.For write heavy use cases, any value between 20 - 40 % is a good value. Also note that sum of block cache as explained in point 4 above and global memstore size should never be greater than 70 - 75% of total heap, this is so that we have enough heap available for regular hbase operations apart from read and write caching.
This is the size of each memstore opened for a single column family , during write operations, when this size is used completely , the memstore gets flushed to disk in the form of a hfile. Also to note here that all memstores for a single region would get flushed even if any one of them reaches this size. Each flush operation creates an hfile , so smaller this number, chances of having more frequent flushes, more IO overhead, greater the number of Hfiles getting created and subsequently, greater the number of Hfiles , quicker the compaction getting triggered. And we understand compaction involves additional round of write operations as it writes smaller Hfiles into a bigger Hfile and hence proving to be significant overhead if getting triggered very frequently.
Thus a significantly bigger flush size would ensure lesser Hfiles and lesser compactions, but caveat here is the total heap size and number of regions and column families on each region server. If you have too many regions and column families, you cannot afford to have a bigger flush size under limited total heap size. Ideal numbers are anything between 128 MB to 256 MB.
This is a simple tuning parameter, allows single memstore to get stretched by this multiplier during heavy bursty writes. Once memstore reaches this size (flush size X multiplier ) , write operations are blocked on this column family until flushes are completed.
This parameter defines the number of RPC listeners / threads that are spun up to answer incoming requests from users. The default value is 30. Good to keep it higher if more concurrent users are trying to access Hbase , however the value should also be proportional to number of CPU cores and region server heap you have on each region server as each thread consumes some amount memory and CPU cycles.
A rule of thumb is to keep the value low when the payload for each request is large, and keep the value high when the payload is small. Start with a value double the number of cores on the node and increase it as per the requirements further.