HBase table copy from one cluster to another

Is there a way to copy HBase (Phoenix) tables from one cluster to another? If so, can anyone tell me what the best option is?

1 ACCEPTED SOLUTION

Rising Star

@ARUN,

Both the "CopyTable" and "Export/Import" methods work for this, but they degrade RegionServer performance while copying. I would prefer the "Snapshot" method, which is best for backup and recovery.

Note: The snapshot method will only work if both clusters run the same version of HBase. I tried it.

If your clusters run different HBase versions, you can use the CopyTable method.

Snapshot method:

Go to the HBase shell and take a snapshot of the table:

=>hbase shell

=>snapshot "SOURCE_TABLE_NAME","SNAPSHOT_TABLE_NAME"

Then you can export that snapshot to the other cluster like this:

=>bin/hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot SNAPSHOT_TABLE_NAME -copy-to hdfs://DESTINATION_CLUSTER_ACTIVE_NAMENODE_ADDRESS:8020/hbase -mappers 16

After this, you can restore the table on the destination cluster. On Dest_Cluster:

=>hbase shell

=>disable "DEST_TABLENAME"

=>restore_snapshot "SNAPSHOT_TABLE_NAME"

Done. Your table will be copied.
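The steps above can be strung together as a small dry-run script. This is only a sketch: it prints the commands instead of running them, every name is a placeholder, and the function names are made up for illustration. It assumes the destination table has the same name as the source table, because `restore_snapshot` targets the table name recorded in the snapshot. The trailing `enable` is not part of the steps above; it is added on the assumption that a restored table is left disabled.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for a snapshot-based copy; it runs nothing.
# Every value below is a placeholder to be replaced with your own names.
SRC_TABLE="${SRC_TABLE:-SOURCE_TABLE_NAME}"
SNAPSHOT="${SNAPSHOT:-SNAPSHOT_TABLE_NAME}"
DEST_ROOTDIR="${DEST_ROOTDIR:-hdfs://DESTINATION_CLUSTER_ACTIVE_NAMENODE_ADDRESS:8020/hbase}"
MAPPERS="${MAPPERS:-16}"

# HBase shell statement that takes the snapshot on the source cluster.
snapshot_cmd() {
  printf "snapshot '%s', '%s'\n" "$1" "$2"
}

# Command line that copies the snapshot into the destination HBase root directory.
export_snapshot_cmd() {
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' "$1" "$2" "$3"
}

# HBase shell statements for the destination cluster.
# Assumption: the table name matches the one stored in the snapshot.
# The final 'enable' is an addition to the original steps.
restore_cmds() {
  printf "disable '%s'\n" "$1"
  printf "restore_snapshot '%s'\n" "$2"
  printf "enable '%s'\n" "$1"
}

echo "# 1. Source cluster, inside 'hbase shell':"
snapshot_cmd "$SRC_TABLE" "$SNAPSHOT"
echo "# 2. Source cluster, from the OS shell:"
export_snapshot_cmd "$SNAPSHOT" "$DEST_ROOTDIR" "$MAPPERS"
echo "# 3. Destination cluster, inside 'hbase shell':"
restore_cmds "$SRC_TABLE" "$SNAPSHOT"
```

If the table does not exist yet on the destination, or should get a different name, `clone_snapshot 'SNAPSHOT_TABLE_NAME', 'NEW_TABLE_NAME'` in the HBase shell is probably the better fit than disable plus restore.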

7 REPLIES

Hi @ARUN

You have several options to do this:

  • CopyTable
  • Export the table, copy the files into a new cluster and Import the table (see documentation section after CopyTable)
  • In HDP 2.5, there's a new snapshotting feature. I am not sure whether this feature is complete, since I haven't tried it. There's an open Jira, and the backup/restore feature is listed as Tech Preview.

Note that the first two options can have an impact on the RegionServer while the third one has minimal impact.

Master Collaborator

You can take a look at:

http://hbase.apache.org/book.html#ops.snapshots

Especially:

http://hbase.apache.org/book.html#ops.snapshots.export

The backup / restore feature in HDP 2.5 makes backing up multiple tables easy to operate.

Expert Contributor

Please see the options and the note below.

NOTE: This applies to both options, CopyTable and Export/Import.

Since the cluster is up, there is a risk that edits could be missed in the export process.

http://hbase.apache.org/0.94/book/ops_mgt.html#copytable

CopyTable is a utility that can copy part or all of a table, either to the same cluster or to another cluster. The usage is as follows:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] tablename
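As a rough illustration of the usage line above, the sketch below builds a cross-cluster CopyTable command and prints it without running anything. The ZooKeeper hosts, port, and znode are hypothetical placeholders, and the helper function names are made up for this example. The `--peer.adr` value is assumed to follow the `quorum:clientPort:znodeParent` form of the destination cluster; as far as I know, the target table also has to exist on the destination beforehand with the same column families.

```shell
#!/bin/sh
# Dry-run sketch: builds a CopyTable command line and prints it; nothing is executed.
# The ZooKeeper hosts, port, and znode below are hypothetical placeholders.

# --peer.adr has the form quorum:clientPort:znodeParent for the destination cluster.
peer_adr() {
  printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Arguments: table name, peer address, optional new table name on the destination.
copytable_cmd() {
  opts="--peer.adr=$2"
  if [ -n "${3:-}" ]; then
    opts="--new.name=$3 $opts"
  fi
  printf 'hbase org.apache.hadoop.hbase.mapreduce.CopyTable %s %s\n' "$opts" "$1"
}

PEER="$(peer_adr 'zk1.dest.example.com,zk2.dest.example.com' 2181 /hbase)"

# Same table name on both clusters.
copytable_cmd SOURCE_TABLE_NAME "$PEER"
# Different table name on the destination cluster.
copytable_cmd SOURCE_TABLE_NAME "$PEER" DEST_TABLE_NAME
```

The znode parent differs between installations (for example `/hbase` versus a distribution-specific path), so it is worth checking `zookeeper.znode.parent` in the destination cluster's hbase-site.xml before running the printed command.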

http://hbase.apache.org/0.94/book/ops_mgt.html#export

14.1.7. Export

Export is a utility that will dump the contents of table to HDFS in a sequence file. Invoke via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]

Note: caching for the input Scan is configured via hbase.client.scanner.caching in the job configuration.

14.1.8. Import

Import is a utility that will load data that has been exported back into HBase. Invoke via:

$ bin/hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>
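Putting Export and Import together for a cross-cluster copy might look like the dry-run sketch below. It only prints the three commands. The HDFS paths and NameNode addresses are hypothetical, the helper names are invented for this example, and using `hadoop distcp` for the copy step between clusters is my assumption, since the thread only says to copy the files. The destination table is assumed to exist already before Import runs.

```shell
#!/bin/sh
# Dry-run sketch: prints an Export -> copy -> Import sequence; nothing is executed.
# All table names, hosts, and paths are hypothetical placeholders.
TABLE="${TABLE:-SOURCE_TABLE_NAME}"
SRC_DIR="${SRC_DIR:-hdfs://SOURCE_NAMENODE:8020/tmp/hbase-export/SOURCE_TABLE_NAME}"
DEST_DIR="${DEST_DIR:-hdfs://DESTINATION_NAMENODE:8020/tmp/hbase-export/SOURCE_TABLE_NAME}"

# Dump the table to sequence files in HDFS on the source cluster.
table_export_cmd() {
  printf 'hbase org.apache.hadoop.hbase.mapreduce.Export %s %s\n' "$1" "$2"
}

# Copy the exported files to the destination cluster (assumed tool: hadoop distcp).
distcp_cmd() {
  printf 'hadoop distcp %s %s\n' "$1" "$2"
}

# Load the copied files into the (already created) table on the destination cluster.
table_import_cmd() {
  printf 'hbase org.apache.hadoop.hbase.mapreduce.Import %s %s\n' "$1" "$2"
}

echo "# 1. Source cluster:"
table_export_cmd "$TABLE" "$SRC_DIR"
echo "# 2. Either cluster:"
distcp_cmd "$SRC_DIR" "$DEST_DIR"
echo "# 3. Destination cluster:"
table_import_cmd "$TABLE" "$DEST_DIR"
```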

Rising Star

@AMIT,

Before using any of these methods, please take a backup of the destination cluster's table using the snapshot method, like this:

On destination cluster,

=>hbase shell

=>snapshot "DEST_TABLE_NAME","SNAPSHOT_DEST_TABLE_NAME"

That way your data on the destination cluster will not be lost. You can use this method to keep your data on the destination cluster safe. Afterwards, you can revert to the backup like this:

=>hbase shell

=>disable "DEST_TABLE_NAME"

=>restore_snapshot "SNAPSHOT_DEST_TABLE_NAME"
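The backup and rollback steps above can also be generated as text and pasted or piped into the HBase shell. The sketch below only prints the statements. The function names are hypothetical, and the `list_snapshots` check and the final `enable` are additions that are not part of the steps above.

```shell
#!/bin/sh
# Dry-run sketch: prints HBase shell statements for a safety snapshot and a rollback.
# Table and snapshot names are placeholders.
DEST_TABLE="${DEST_TABLE:-DEST_TABLE_NAME}"
BACKUP_SNAPSHOT="${BACKUP_SNAPSHOT:-SNAPSHOT_DEST_TABLE_NAME}"

# Take the safety snapshot, then list snapshots to confirm it exists (added check).
backup_cmds() {
  printf "snapshot '%s', '%s'\n" "$1" "$2"
  printf "list_snapshots\n"
}

# Roll the table back to the safety snapshot.
# The trailing 'enable' is an addition, assuming the table stays disabled after a restore.
rollback_cmds() {
  printf "disable '%s'\n" "$1"
  printf "restore_snapshot '%s'\n" "$2"
  printf "enable '%s'\n" "$1"
}

echo "# Before the copy, on the destination cluster:"
backup_cmds "$DEST_TABLE" "$BACKUP_SNAPSHOT"
echo "# To revert afterwards, on the destination cluster:"
rollback_cmds "$DEST_TABLE" "$BACKUP_SNAPSHOT"
```

To actually run a group of statements, something like `backup_cmds DEST_TABLE_NAME SNAPSHOT_DEST_TABLE_NAME | hbase shell` should work on the destination cluster, though it is worth trying against a test table first.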

avatar

If we copy the HFiles manually from one HBase cluster to another, the list command displays all the tables, but scanning a table does not show any data. This is because I have not copied the META table entries. Is there a way to copy the META table entries to another HBase instance so that both the already existing tables and the new tables are retained with their data?
