Which is the best method for backing up HBase data?
Labels: Apache HBase
Created ‎02-18-2016 07:48 AM
Hi,
Can anyone suggest which is the best method for backing up HBase data among DistCp, CopyTable, Export/Import, and cluster replication?
Created ‎02-18-2016 08:19 AM
Hi @Rushikesh Deshmukh, for a list of backup options, check this. CopyTable is a nice option: it uses multiple mappers, and you can copy individual tables to the same cluster or to another one. You may miss a few edits made while the copy is running, but you will end up with a useful copy.
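To make the CopyTable suggestion concrete, here is a minimal sketch in Python that only assembles the command line for the `org.apache.hadoop.hbase.mapreduce.CopyTable` job and does not run it. The table name `orders`, the target name `orders_backup`, and the ZooKeeper address `zk1,zk2,zk3:2181:/hbase` are hypothetical placeholders; check the options against the usage text printed by your own HBase version before relying on them.

```python
from typing import List, Optional

COPYTABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.CopyTable"


def copytable_command(
    table: str,
    peer_adr: Optional[str] = None,
    new_name: Optional[str] = None,
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
) -> List[str]:
    """Build the argv for a CopyTable run; nothing is executed here."""
    argv = ["hbase", COPYTABLE_CLASS]
    if start_ms is not None:
        # Copy only cells with timestamps at or after this time (epoch ms).
        argv.append(f"--starttime={start_ms}")
    if end_ms is not None:
        argv.append(f"--endtime={end_ms}")
    if new_name is not None:
        # Name of the destination table (must already exist).
        argv.append(f"--new.name={new_name}")
    if peer_adr is not None:
        # Destination cluster as quorum:clientPort:znodeParent.
        argv.append(f"--peer.adr={peer_adr}")
    argv.append(table)  # source table is the final positional argument
    return argv


if __name__ == "__main__":
    cmd = copytable_command(
        "orders",  # hypothetical source table
        peer_adr="zk1,zk2,zk3:2181:/hbase",  # hypothetical target cluster
        new_name="orders_backup",  # hypothetical destination table
    )
    print(" ".join(cmd))
```

Omitting `peer_adr` keeps the copy on the same cluster, in which case `new_name` is needed so the destination differs from the source. The resulting list can be handed to `subprocess.run(cmd, check=True)` on a node with the HBase client installed.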
Created ‎02-18-2016 07:51 AM
You can use snapshots to take a backup: https://hbase.apache.org/book.html#ops.snapshots
Created ‎02-18-2016 07:54 AM
@Rajeshbabu Chintaguntla, can you please explain the advantages of using snapshots over the other methods, and also provide the command used for taking snapshots?
Created ‎02-21-2016 06:35 AM
@Rajeshbabu Chintaguntla, thanks for sharing this information and the link.
Created ‎02-18-2016 08:00 AM
In order of preference (both also have very little impact on the running cluster):
Cluster replication: if the requirement is to recover in real time and a second cluster can be afforded.
Export snapshot: if recovering to the last snapshot taken is acceptable. This approach costs less, since you can export the snapshot to any cheap storage (HDFS, S3, or anything else). However, incremental backup is not possible with it; old backups become obsolete with each new one.
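As an illustration of the export-snapshot option, the sketch below builds the command line for the `org.apache.hadoop.hbase.snapshot.ExportSnapshot` tool without executing it. The snapshot name `orders-20160218` and the destination `hdfs://backup-nn:8020/hbase` are hypothetical; the snapshot is assumed to exist already on the source cluster.

```python
from typing import List, Optional

EXPORT_CLASS = "org.apache.hadoop.hbase.snapshot.ExportSnapshot"


def export_snapshot_command(
    snapshot: str,
    copy_to: str,
    mappers: Optional[int] = None,
) -> List[str]:
    """Build the argv for an ExportSnapshot run; nothing is executed here."""
    if mappers is not None and mappers < 1:
        raise ValueError("mappers must be a positive integer")
    argv = [
        "hbase", EXPORT_CLASS,
        "-snapshot", snapshot,  # name of an existing snapshot
        "-copy-to", copy_to,    # root directory on the destination filesystem
    ]
    if mappers is not None:
        # Number of map tasks used to copy the snapshot files.
        argv += ["-mappers", str(mappers)]
    return argv


if __name__ == "__main__":
    cmd = export_snapshot_command(
        "orders-20160218",              # hypothetical snapshot name
        "hdfs://backup-nn:8020/hbase",  # hypothetical destination
        mappers=16,
    )
    print(" ".join(cmd))
```

Because each export is a full copy of the snapshot, scheduling this per day with a date-stamped snapshot name is one simple way to keep several restore points, at the cost of storage.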
Created ‎02-21-2016 06:34 AM
@asinghal, thanks for sharing this useful information.
Created ‎02-18-2016 08:02 AM
Hi Rushikesh,
I suggest going through this article; it might be helpful:
Regards,
Karthik Gopal
Created ‎02-18-2016 08:04 AM
HBase Snapshots allow you to take a snapshot of a table without much impact on Region Servers. Snapshot, clone, and restore operations don't involve data copying. In addition, exporting a snapshot to another cluster has no impact on region servers.
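For the snapshot, clone, and restore operations described above, the sketch below generates the corresponding HBase shell statements (`snapshot`, `clone_snapshot`, `restore_snapshot`) as text. The table name `orders`, the date-stamped naming scheme, and the helper names are hypothetical; the restore script disables the table first on the assumption that your HBase version requires that.

```python
from datetime import date


def _check(name: str) -> str:
    """Reject names that would break the single-quoted shell literal."""
    if not name or "'" in name:
        raise ValueError(f"unsafe name: {name!r}")
    return name


def snapshot_name(table: str, day: date) -> str:
    """Hypothetical naming scheme: <table>-<YYYYMMDD>."""
    return f"{_check(table)}-{day:%Y%m%d}"


def snapshot_script(table: str, snap: str) -> str:
    """Take a snapshot of a table."""
    return f"snapshot '{_check(table)}', '{_check(snap)}'\n"


def clone_script(snap: str, new_table: str) -> str:
    """Create a new table from a snapshot without copying data."""
    return f"clone_snapshot '{_check(snap)}', '{_check(new_table)}'\n"


def restore_script(table: str, snap: str) -> str:
    """Roll a table back to a snapshot (table disabled during the restore)."""
    return (
        f"disable '{_check(table)}'\n"
        f"restore_snapshot '{_check(snap)}'\n"
        f"enable '{table}'\n"
    )


if __name__ == "__main__":
    snap = snapshot_name("orders", date(2016, 2, 18))
    print(snapshot_script("orders", snap), end="")
    print(clone_script(snap, "orders_copy"), end="")
    print(restore_script("orders", snap), end="")
```

The generated text can be piped into the shell's non-interactive mode, for example `subprocess.run(["hbase", "shell", "-n"], input=script, text=True, check=True)`, or pasted into an interactive `hbase shell` session.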
Created ‎02-18-2016 08:13 AM
Thanks for the reply; I found some useful information at the given link. Are HBase snapshots the best method for backups in a live environment?
