Created 05-21-2019 01:49 PM
Hi, I would like to copy data from one HBase cluster to another HBase cluster. While copying, I need to move the data in sets of 1000 records: the first 1000 records as one set, the next 1000 as a second set, and so on. How can I achieve this using commands or scripts?
Note: my HBase table is huge, so I would like to split the data and copy it from one cluster to the other (without moving it to HDFS).
Created 05-22-2019 02:38 AM
I would strongly suggest you look at HBase's snapshotting model as detailed at https://hbase.apache.org/book.html#ops.snapshots. Creating a snapshot is very fast because it does NOT copy the underlying HFiles on HDFS; it only keeps references ("pointers") to them. You can then use the ExportSnapshot tool, which copies the needed underlying HFiles over to the second HBase cluster. This model won't use any extra space on the source cluster when the snapshot is taken (well, delete the snapshot once you are done, since it keeps the referenced HFiles from being cleaned up!). On the target cluster it only uses the space for the HFiles themselves, which you would have to create there anyway, and that is exactly what this process does.
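To make those steps concrete, here is a minimal sketch in Python that only builds and prints the commands involved, so you can review them before running anything. The table name `my_table`, the snapshot name `my_table_snap`, the destination `hdfs://target-namenode:8020/hbase`, and the mapper count are all made-up placeholders, and the helper function names are mine, not part of HBase. Check the `ExportSnapshot` options against the documentation for your HBase version.

```python
import shlex

# Tool class run via the `hbase` launcher script to copy a snapshot.
EXPORT_CLASS = "org.apache.hadoop.hbase.snapshot.ExportSnapshot"


def snapshot_statement(table, snapshot):
    """HBase shell statement that creates the snapshot (run on the source cluster)."""
    return f"snapshot '{table}', '{snapshot}'"


def export_snapshot_cmd(snapshot, dest_root, mappers=4):
    """Argument list for the ExportSnapshot job (run on the source cluster).

    dest_root is the HBase root directory on the target cluster's HDFS
    (a placeholder value is used below).
    """
    return [
        "hbase", EXPORT_CLASS,
        "-snapshot", snapshot,
        "-copy-to", dest_root,
        "-mappers", str(mappers),
    ]


def clone_statement(snapshot, new_table):
    """HBase shell statement that builds a table from the snapshot (target cluster)."""
    return f"clone_snapshot '{snapshot}', '{new_table}'"


def delete_statement(snapshot):
    """HBase shell statement that removes the snapshot once it is no longer needed."""
    return f"delete_snapshot '{snapshot}'"


if __name__ == "__main__":
    # Placeholder names; replace with your own before use.
    table, snap = "my_table", "my_table_snap"
    dest = "hdfs://target-namenode:8020/hbase"

    print("# source cluster, in the hbase shell:")
    print(snapshot_statement(table, snap))
    print("# source cluster, from the command line:")
    print(" ".join(shlex.quote(arg) for arg in export_snapshot_cmd(snap, dest)))
    print("# target cluster, in the hbase shell:")
    print(clone_statement(snap, table))
    print("# source cluster, in the hbase shell, after verifying the copy:")
    print(delete_statement(snap))
```

Printing the commands instead of executing them is a deliberate choice here: the export runs as a MapReduce job over the whole table, so it is worth double-checking the destination and mapper count first.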
Good luck and happy HBasing!