Member since: 07-13-2016
Posts: 7
Kudos Received: 2
Solutions: 0
08-17-2016
09:05 PM
1 Kudo
Is it possible for the load balancer to create new regions? I have a table with 100 regions, and every region is smaller than 3.5 GB. The max region size is set to 10 GB (the default). The daily load goes through the bulk loader and adds approximately 250 MB to each region, yet I see new regions being added to this table every day, and I don't know why. New regions should be created only when a region grows beyond 10 GB and a region split happens. Any help or thoughts are appreciated.
Labels: Apache HBase
07-17-2016
11:04 PM
Thanks, mqureshi, for the reference doc; that is exactly what happened. I created a new table and loaded data 6 times, which produced a single region with 6 HFiles. The total size of the table was 24.2 GB, with a 10 GB region limit. I then ran a major compaction, which created 12 new regions and deleted the parent region. It looks like whenever a split happens, new regions are created according to some formula. My guess at the formula: new regions added after split = ~(number of HFiles * 2) - 1 (the original region is removed). Or is there an actual way to get the number of regions after a split?
07-17-2016
08:28 PM
I created an HBase table with the commands below and pre-split it into 20 regions:
hbase org.apache.hadoop.hbase.util.RegionSplitter test_rec_a UniformSplit -c 20 -f rec
alter 'test_rec_a', {METHOD => 'table_att', CONFIGURATION => {'SPLIT_POLICY' => 'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy'}},
{NAME=>'rec', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROWCOL', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', TTL =>'5184000', BLOCKSIZE => '65536', IN_MEMORY => 'true', BLOCKCACHE => 'true',MIN_VERSIONS => '0',KEEP_DELETED_CELLS => 'false'}
enable 'test_rec_a'
HBase config: max region size: 10 GB
I ran 3 bulk load jobs with 2 GB, 9 GB, and 80 GB files. All keys in the files are unique. I expected the jobs to load data into all 20 regions, but the data was loaded into a single region only.
Is there something I am missing here?
I want to pre-split the table into 20 regions, but I don't know the key distribution because the keys are hashed. Is there a way to pre-split without knowing the key distribution, or is not pre-splitting the right option? Thanks.
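When keys are hashed, one common approach is to split the key space evenly instead of relying on the actual key distribution. Below is a minimal sketch of that idea, assuming the row keys are fixed-width lowercase hex strings (for example, a hash prefix). The function name `hex_split_points` and the 8-character width are illustrative assumptions, not part of HBase.

```python
def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced split keys over a hex key space.

    Assumes row keys are lowercase hex strings of at least `width` characters.
    """
    if num_regions < 2:
        return []
    key_space = 16 ** width  # number of distinct hex prefixes of this width
    # Boundary i sits i/num_regions of the way through the key space.
    return [format(key_space * i // num_regions, "0{}x".format(width))
            for i in range(1, num_regions)]


splits = hex_split_points(20)
print(len(splits))  # 19 split keys delimit 20 regions
print(splits[:3])
```

Split keys like these could be passed to the HBase shell `create` command through its `SPLITS` option. If the hashed keys are raw bytes and not hex text, the boundaries would need to be computed over byte values instead, which is closer to what `UniformSplit` is meant for.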
Labels: Apache HBase
07-14-2016
12:07 AM
Thanks, guys, for sharing your thoughts. We were using IncreasingToUpperBoundRegionSplitPolicy and have now changed it to ConstantSizeRegionSplitPolicy. That solved the mystery. Thanks again for the help!
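A rough sketch of why the two policies behave so differently. It is a simplified model of the HBase 1.x behavior, in which IncreasingToUpperBoundRegionSplitPolicy splits at the smaller of the max region size and (initial size × the cube of the number of the table's regions on the same region server), with the initial size defaulting to twice the memstore flush size. The 128 MB flush size, the 10 GB max size, and the function names are assumptions for illustration; real clusters may also apply jitter or override these settings.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def increasing_split_threshold(regions_on_server,
                               max_file_size=10 * GB,
                               flush_size=128 * MB):
    """Approximate split threshold under IncreasingToUpperBoundRegionSplitPolicy."""
    # With no regions, or more than 100, the policy falls back to the max size.
    if regions_on_server == 0 or regions_on_server > 100:
        return max_file_size
    initial_size = 2 * flush_size  # assumed default initial size
    return min(max_file_size, initial_size * regions_on_server ** 3)


def constant_split_threshold(max_file_size=10 * GB):
    """ConstantSizeRegionSplitPolicy always splits at the max region size."""
    return max_file_size


for n in (1, 2, 3, 4):
    print(n, increasing_split_threshold(n) / GB, constant_split_threshold() / GB)
```

Under these assumptions, a table with a single region on a server can split at only 256 MB, which would explain regions splitting long before they reach 10 GB.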
07-13-2016
02:28 PM
Hi Josh, thanks for the reply. I have a table in production that holds only 1 TB of data, and max.region.size is set to 10 GB. I would expect the number of regions to be in the range of 100 to 200 for this dataset, but I see ~800 regions for the table. I also created a demo table and shared the results. In the test, the data size is 9 GB, which is less than the max region size (10 GB), and the table has 5 regions. Why did the table grow to 5 regions in the first place when 1 region was enough? No pre-splitting was done before the major compaction, all 5 regions held less than 10 GB, and no new data was added, so why would a major compaction increase the number of regions? It should only have tried to merge the HFiles into a single HFile in each of the 5 regions. Any information or explanation will help. Thanks.
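The 100 to 200 estimate can be checked with quick arithmetic. This sketch assumes regions split only at the max region size, so every region sits somewhere between half full (just after a split) and full. The function name `expected_region_range` is hypothetical, and 1 TB is taken as 1024 GB.

```python
import math


def expected_region_range(data_gb, max_region_gb):
    """Return (fewest, most) regions if every region is between half full and full."""
    fewest = math.ceil(data_gb / max_region_gb)      # all regions full
    most = math.ceil(data_gb / (max_region_gb / 2))  # all regions just split
    return fewest, most


low, high = expected_region_range(1024, 10)
print(low, high)
```

A count of ~800 is well outside that range, which suggests splits are being triggered below the 10 GB limit.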
07-13-2016
01:21 AM
1 Kudo
We have a table whose size is 8.7 GB, and it had 5 regions.
We ran a major compaction on the table, and its size increased to 21.7 GB; after some time the size came back down to 8.7 GB.
However, the number of regions increased from 5 to 27 and then came down to 17, and it never went back to 5.
Why did the number of regions increase from 5 to 17 although the size of the data stayed the same?
Labels: Apache HBase