I have an HBase table whose 56 regions ended up on only two region servers, although the cluster has 8 region servers. I suspect this is why, when I read the table through Hive, all 56 mappers get stuck in processing. What can I do to speed this up? My guess is that the 56 regions need to be distributed across the other region servers. Does anyone know how this can be done?
hbase> balance_switch true
Jitendra, when you say balance_switch true, what does it actually do? The region servers' data was already balanced. My issue is that the 56 regions of this table are loaded onto only two region servers instead of all eight.
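If that is the case, you have to move the regions for that table manually. Try this on a few regions and see whether they get distributed. Here 'SERVER_NAME' is optional; if you don't provide it, a random region server is picked. Note that the first argument is the encoded region name (the hash suffix of the full region name), and a server name has the form host,port,startcode.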
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
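With 56 regions, typing each move by hand gets tedious. As a rough sketch, the small Python script below spreads a list of encoded region names round-robin over a list of servers and prints one move command per region, ready to paste into the hbase shell. The region names and server names here are made-up placeholders; the real values would have to come from your own cluster (for example from the HBase master web UI).

```python
from itertools import cycle

def plan_moves(encoded_regions, servers):
    """Pair each encoded region name with a target server, round-robin."""
    targets = cycle(servers)
    return [(region, next(targets)) for region in encoded_regions]

def to_shell_commands(plan):
    """Render each (region, server) pair as an hbase shell move command."""
    return ["move '%s', '%s'" % (region, server) for region, server in plan]

# Hypothetical placeholders: replace with the real encoded region names
# and the real server names (host,port,startcode) from your cluster.
regions = ["region%02d" % i for i in range(56)]
servers = ["rs%d.example.com,60020,1400000000000" % i for i in range(1, 9)]

for command in to_shell_commands(plan_moves(regions, servers)):
    print(command)
```

With 56 regions and 8 servers this plan puts 7 regions on each server. It ignores where a region currently lives, so some commands may simply move a region to the server it is already on.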
@ammu ch As Jitendra mentioned, you can balance your HBase tables. Working out how and why your job is stuck is the first step. Skew can be one cause; inefficient use of row keys can be another.
Alternatively, to make things faster you can also use Hive to read snapshots of the HBase table. This can be significantly faster, as the data is read directly from HDFS and not through the HBase online API. This presentation has further info if you want: http://fr.slideshare.net/HBaseCon/ecosystem-session-3a
hope this helps
Which release of HDP are you using?
When the regions in the cluster are balanced, it is not guaranteed that regions per table would be balanced.
Here is related cost key from StochasticLoadBalancer:
private static final String TABLE_SKEW_COST_KEY = "hbase.master.balancer.stochastic.tableSkewCost";
private static final float DEFAULT_TABLE_SKEW_COST = 35;
You can increase the value for the key so that regions per table are better balanced.
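To make the difference between cluster balance and per-table balance concrete, here is a toy Python illustration. The layout is entirely hypothetical (a table called t_hot and a catch-all "other" standing in for the rest of the tables); it is not taken from the poster's cluster and only shows how every server can host the same number of regions while one table sits on just two of them.

```python
from collections import Counter

def regions_per_server(assignments, table=None):
    """Count regions per server, optionally restricted to one table."""
    return Counter(
        server for tbl, server in assignments
        if table is None or tbl == table
    )

# Hypothetical layout: 8 servers, 28 regions each.
# All 56 regions of t_hot sit on rs1 and rs2; other tables fill rs3..rs8.
assignments = [("t_hot", "rs1")] * 28 + [("t_hot", "rs2")] * 28
assignments += [("other", "rs%d" % i) for i in range(3, 9) for _ in range(28)]

total = regions_per_server(assignments)
hot = regions_per_server(assignments, table="t_hot")

print("regions per server (all tables):", dict(total))
print("regions per server (t_hot only):", dict(hot))
```

Counted over all tables the servers look evenly loaded, yet t_hot is served by only two of them, which is the kind of per-table skew the table skew cost is meant to penalize.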
What was the load like on the region servers when the Hive job was running?
Have you disabled swapping?
If you can provide some more details, that would help us determine the cause.
We are using HDP 2.2 with HBase 0.98 and Hive 0.14.
Yes, all the region servers are already balanced, but for this table all 56 regions are stored on only two servers. For other tables I see the regions distributed across all region servers.