HBase export_snapshot between different major versions

Explorer

As per the accepted solution posted in this question, export_snapshot is not supported (or may not work) between major versions of HBase. However, this CDP documentation says that export_snapshot is supported from CDH 5 (HBase 1.2.0) to CDP 7 (HBase 2.2.3).

 

We are working on migrating HBase tables from HBase 1.2 to HBase 2 and would like to know whether export_snapshot works in this scenario.

1 ACCEPTED SOLUTION

Super Collaborator

Hello @Krisssh 

 

Yes, you can perform an export_snapshot from CDH 5 to CDP 7. In CDH 5 the default HFile version is 2, and in CDP it is 3. The snapshotted files therefore remain in version 2 after they are copied to CDP. Once the tables are restored from the snapshot and new data is ingested, compactions gradually rewrite the data from the version 2 snapshot files into new version 3 files for the table.


5 REPLIES

Explorer

@rki_ Thank you for the reply. 

Both CDH 6 and CDP 7 use HBase 2, so why is the behavior different for CDP?

Super Collaborator

Hello @Krisssh, I don't think there should be any problem exporting a snapshot from CDH 5 to CDH 6, just as we export it to CDP. So I believe the solution here is not very accurate when it comes to exporting snapshots between major versions.

0) Cluster versions
C5 (source):
$ hbase version
HBase 1.2.0-cdh5.17.0-SNAPSHOT

C6 (target):
$ hbase version
HBase 2.1.0-cdh6.3.2

0.1) Make sure no inconsistencies are reported for the target table

1) Take a snapshot in C5
hbase> count 'cluster_test'
100000 row(s) in 16.2700 seconds

hbase> snapshot 'cluster_test', 'cluster_test-ss1'

2) Run the ExportSnapshot tool in C5 
$ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot cluster_test-ss1 -copy-to hdfs://host-10-17-xx.xx:8020/hbase

3) Clone the snapshot in C6
hbase> list_snapshots    # returns the available snapshots
hbase> clone_snapshot 'cluster_test-ss1','cluster_test-new'

4) Confirm the data in C6
hbase> count 'cluster_test-new'
100000 row(s)

5) Run a major compaction against the cloned table in C6
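
For reference, steps 0.1, 2 and 5 above can be sketched as a small shell helper. This is only an illustration under a few assumptions: the function names are made up, the NameNode host is a placeholder, and the script only prints the commands instead of running them. It also assumes `hbase hbck <table>` for the consistency check on the HBase 1.x side and the HBase shell command `major_compact` for step 5, neither of which is spelled out in the steps themselves.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands for steps 0.1, 2 and 5; it does not run them.
# All function names and the host name below are hypothetical placeholders.

# Step 0.1 (assumption): hbck consistency check for one table on the source cluster.
hbck_cmd() {
  local table="$1"
  printf 'hbase hbck %s\n' "$table"
}

# Step 2: ExportSnapshot command line, to be run on the source (C5) cluster.
export_snapshot_cmd() {
  local snapshot="$1" dest_root="$2"
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s\n' \
    "$snapshot" "$dest_root"
}

# Step 5 (assumption): HBase shell statement for a major compaction on the target (C6) cluster.
major_compact_stmt() {
  local table="$1"
  printf "major_compact '%s'\n" "$table"
}

hbck_cmd "cluster_test"
export_snapshot_cmd "cluster_test-ss1" "hdfs://target-nn.example.com:8020/hbase"
major_compact_stmt "cluster_test-new"
```

On the target cluster, the printed statement for step 5 could be piped into a non-interactive shell, for example `major_compact_stmt 'cluster_test-new' | hbase shell -n`. Please verify each printed command against your own environment before running it.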

 

Explorer

Thank you, @rki_, for confirming that the solution works for migration between C5 and C6.

In the example you posted,

 

1. Can you confirm that the export path '/hbase' is the HBase root directory (hbase.rootdir)? Or should it always be '/hbase'?

2. Similar steps should work for any HBase 1 to HBase 2 migration, for example HDP to CDP, right?

Super Collaborator

Hi @Krisssh 

 

1. '/hbase' is the HBase root directory (hbase.rootdir).

2. The steps should work for HDP 3 to CDP; I haven't checked HDP 2 to CDP. There are a few known issues when migrating Phoenix tables from HDP 2 to CDP, and that migration should follow the path HDP 2 => HDP 3 => CDP. For HBase tables, as per this, it has worked.
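
Regarding point 1, here is a rough sketch of how the `-copy-to` value could be derived from the target cluster's hbase.rootdir rather than hard-coding '/hbase'. The function name `copy_to_target` and the host names are made up for illustration, and the sketch assumes hbase.rootdir is either a full URI or an absolute path.

```shell
#!/usr/bin/env bash
# Sketch only: copy_to_target is a hypothetical helper, not part of HBase.
# Prints the value to pass to -copy-to, given the target cluster's hbase.rootdir.
# If rootdir is already a full URI (for example hdfs://...), it is used as is;
# otherwise it is appended to the NameNode URI given as the second argument.
copy_to_target() {
  local rootdir="$1" namenode_uri="$2"
  case "$rootdir" in
    *://*) printf '%s\n' "$rootdir" ;;
    *)     printf '%s%s\n' "${namenode_uri%/}" "$rootdir" ;;
  esac
}

# Placeholder values, not taken from a real cluster.
copy_to_target "/hbase" "hdfs://target-nn.example.com:8020"
# prints hdfs://target-nn.example.com:8020/hbase
```

The rootdir value itself can be read on the target cluster, for instance with `hbase org.apache.hadoop.hbase.util.HBaseConfTool hbase.rootdir`, or looked up in hbase-site.xml.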

 
