HBase: Eliminating empty regions with an export-import approach

We are trying to reduce the number of empty regions in a table (informs_search). The table has around 5,900 regions, thousands of them empty, and holds about 8 TB of data.

We tested an export-import approach on a sample data set (16,819,569 rows):

Back up informs_search

disable 'informs_search'

snapshot 'informs_search', 'informs_search_snpsht'

clone_snapshot 'informs_search_snpsht', 'informs_search_backup'

delete_snapshot 'informs_search_snpsht'

enable 'informs_search'

Export informs_search

/usr/hdp/current/hbase-client/bin/hbase org.apache.hadoop.hbase.mapreduce.Export 'informs_search' /db/support/hexport/inform_search_bk 1 1 1443738964000

Truncate informs_search

truncate 'informs_search'

Import informs_search

hbase org.apache.hadoop.hbase.mapreduce.Import 'informs_search' /db/support/hexport/inform_search_bk

Observations:

  • Before we ran these steps, we had 9 regions (6 + 3) across two region servers.
  • After we ran these steps, we had 2 regions on one region server.

----------------------------------------------------------------------------------------------------------------------------------

* In production, would running the same steps also reduce the table to 2 regions?

* Is there any way to predict or configure the resulting number of regions and region servers?

* Also, how many major compactions will it take for the data to be distributed across the region servers (and regions)?

1 ACCEPTED SOLUTION

The final number of regions depends on the data at hand. If you want to control the number of regions, create a table pre-split into that many regions, and then import into it.
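
As a rough sketch of that suggestion (not a tested procedure for this cluster): the snippet below estimates a region count from the data size and writes the HBase shell commands that recreate the table pre-split. The 10 GB target region size, the column family name 'cf', the output file name, and the choice of HexStringSplit are all assumptions for illustration. Check your cluster's hbase.hregion.max.filesize and the table's real column families before using anything like this.

```shell
#!/usr/bin/env bash
# Rough sizing sketch; every value here is an assumption to adjust.
data_gb=$((8 * 1024))   # ~8 TB of table data, taken from the question
region_gb=10            # assumed target region size in GB

# Ceiling division: how many regions of region_gb hold data_gb.
num_regions=$(( (data_gb + region_gb - 1) / region_gb ))
echo "target regions: ${num_regions}"

# Write the HBase shell commands to a file for review.
# 'cf' is a hypothetical column family name; use the table's real families.
cat > presplit_informs_search.txt <<EOF
disable 'informs_search'
drop 'informs_search'
create 'informs_search', 'cf', {NUMREGIONS => ${num_regions}, SPLITALGO => 'HexStringSplit'}
exit
EOF

# After the Export has finished and been verified, run:
#   hbase shell presplit_informs_search.txt
# and then run the Import step as before.
```

With these numbers the script prints a target of 820 regions. The drop and create take the place of the truncate step, so the export (and the backup) must be complete first. HexStringSplit only spreads data evenly if the row keys look like uniformly distributed hex strings; if they do not, explicit boundaries such as SPLITS => ['key1', 'key2'] (placeholder keys) chosen from the real key distribution are likely a better fit.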

3 REPLIES

@vwunnava@hortonworks.com

Please avoid putting customer names here, as this is a public-facing forum. I am editing your question to remove it.

Removed customer name.