Copying data from one HBase cluster to another HBase cluster

Hi, I would like to copy data from one HBase cluster to another HBase cluster. The copy should happen in batches: 1000 records in the first set, the next 1000 records in the second set, and so on. How can I achieve this using commands or scripts?

Note: my HBase table is huge, so I would like to split the data and copy it from one cluster to the other in pieces (without first moving it to HDFS).

1 ACCEPTED SOLUTION

I would strongly suggest you look at HBase's snapshot model as detailed at https://hbase.apache.org/book.html#ops.snapshots. Creating a snapshot is very fast because it does NOT copy the underlying HFiles on HDFS; it only records references ("pointers") to them. You can then run the ExportSnapshot tool, which copies the needed HFiles over to the second HBase cluster. This approach doesn't use any extra space on the source cluster (as long as you delete the snapshot once you are done!), and on the target cluster it only uses the space for the HFiles the table needs there anyway, which is exactly what this process creates.
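As a rough sketch of that workflow, the Python helper below only builds and prints the commands involved; it does not run anything. The table name `orders`, the snapshot name `orders_snap`, the target root `hdfs://target-nn:8020/hbase`, and the mapper count are all made-up placeholders, and the helper functions themselves are hypothetical. The `snapshot`, `clone_snapshot`, and `delete_snapshot` HBase shell commands and the `-snapshot`, `-copy-to`, and `-mappers` options of ExportSnapshot are the ones described in the HBase book section linked above; check them against your HBase version before relying on them.

```python
import shlex


def export_snapshot_command(snapshot, target_root, mappers=16):
    """Build the argv for HBase's ExportSnapshot MapReduce tool.

    target_root is the HBase root directory of the destination cluster
    (a placeholder value is used in the demo below).
    """
    return [
        "hbase",
        "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,
        "-copy-to", target_root,
        "-mappers", str(mappers),
    ]


def shell_steps(table, snapshot):
    """Return the HBase shell statements used around the export."""
    return {
        # Run in the HBase shell on the source cluster before the export.
        "source_create": f"snapshot '{table}', '{snapshot}'",
        # Run in the HBase shell on the target cluster after the export.
        "target_clone": f"clone_snapshot '{snapshot}', '{table}'",
        # Run on the source cluster once the copy has been verified.
        "source_cleanup": f"delete_snapshot '{snapshot}'",
    }


if __name__ == "__main__":
    # Hypothetical names; substitute your own table, snapshot, and target.
    steps = shell_steps("orders", "orders_snap")
    export = export_snapshot_command(
        "orders_snap", "hdfs://target-nn:8020/hbase", mappers=4
    )
    print("1. source hbase shell :", steps["source_create"])
    print("2. source command line:", shlex.join(export))
    print("3. target hbase shell :", steps["target_clone"])
    print("4. source hbase shell :", steps["source_cleanup"])
```

Note that this copies whole HFiles rather than fixed batches of 1000 rows; the export job spreads the file copies across its mappers, which is usually what makes it practical for a very large table.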

Good luck and happy HBasing!
