HBase snapshot or backup ?

Rising Star

Hi,

I just spent a couple of minutes going through the tutorial about HBase backup & restore:

https://fr.hortonworks.com/hadoop-tutorial/introduction-apache-hbase-concepts-apache-phoenix-new-bac...

However, I cannot figure out what is really new here, since it was already possible to take snapshots of tables and restore them. What is the difference between taking a snapshot and performing a backup (apart from the fact that a backup copies the full data, whereas a snapshot is just a "pointer" to a previous state of the table)?

2 REPLIES

Guru

Snapshots of HBase tables first appeared in version 0.94.6: http://hbase.apache.org/0.94/book/ops.snapshots.html

Snapshots are pointers to data at a moment in time, with markers that prevent the data from being deleted (and that identify the moment in time). They have a great advantage: they do not copy data, and copying is what places stress on region servers. Note, though, that if you export a snapshot to a different cluster, the data will be copied.
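To make the "pointer" idea concrete, here is a minimal toy model in Python. It is only a sketch: `ToyTable` and all of its methods are hypothetical names invented for illustration, not HBase APIs, and real HBase handles store files and archiving in a more involved way. The point it shows is that taking a snapshot records references to existing immutable files rather than copying rows, and that those references keep the files from being deleted later.

```python
# Toy model only: none of these names are HBase APIs.
class ToyTable:
    def __init__(self):
        self.files = {}      # file name -> immutable rows (stand-in for store files)
        self.live = set()    # names of the files that currently make up the table
        self.snapshots = {}  # snapshot name -> frozenset of referenced file names

    def flush(self, name, rows):
        """Write a new immutable file and add it to the live table."""
        self.files[name] = tuple(rows)
        self.live.add(name)

    def snapshot(self, snap_name):
        """Record references to the live files; no row data is copied."""
        self.snapshots[snap_name] = frozenset(self.live)

    def compact(self, new_name):
        """Merge live files into one, then delete old files no snapshot references."""
        merged = [row for f in sorted(self.live) for row in self.files[f]]
        old = set(self.live)
        self.files[new_name] = tuple(merged)
        self.live = {new_name}
        referenced = set().union(*self.snapshots.values())
        for f in old - referenced:
            del self.files[f]

    def restore(self, snap_name):
        """Point the table back at the files the snapshot referenced."""
        self.live = set(self.snapshots[snap_name])

    def rows(self):
        return [row for f in sorted(self.live) for row in self.files[f]]


table = ToyTable()
table.flush("f1", ["a", "b"])
table.snapshot("s1")          # references f1 only
table.flush("f2", ["c"])
table.compact("f3")           # f2 is deleted, f1 survives because s1 references it
table.restore("s1")
print(table.rows())
```

Note that in this model the snapshot lives in the same storage as the table itself, which is why it helps with user error but not with losing the storage.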

Keep in mind that snapshot vs backup advantages are all about what you are trying to do.

Snapshots are great for recovering from user error but not from hardware error / Disaster Recovery (DR).

Backups work for DR but full batch backups involve large amounts of copying and stress region servers as mentioned.

Incremental backups are best for DR because they give you a complete backup while copying only the changes made since the last backup.

See this for an excellent discussion of the matter:

https://hortonworks.com/blog/coming-hdp-2-5-incremental-backup-restore-apache-hbase-apache-phoenix/

Super Guru
@Sebastien Chausson

The new feature is incremental backup. You take one snapshot to create your baseline. After that, with this new feature, incremental backups are made from the WAL (write-ahead log) entries written since the last backup. This way you are not making a full copy and moving it to the DR cluster each time; only the incremental data is moved. The following link has a really good discussion about this feature, with all questions answered on what this feature really adds:

https://issues.apache.org/jira/browse/HBASE-7912
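The baseline-plus-WAL idea from this reply can be sketched with a small Python toy model. This is an illustration under stated assumptions, not how HBase implements it: `ToyCluster`, `ToyBackupSite`, and the sequence-number bookkeeping are hypothetical names made up for this example. It shows a full backup copying the whole state once, and each later incremental backup shipping only the log entries written since the previous backup.

```python
# Toy model only: none of these names are HBase APIs.
class ToyCluster:
    def __init__(self):
        self.wal = []  # append-only log of (sequence number, key, value)
        self.seq = 0

    def put(self, key, value):
        self.seq += 1
        self.wal.append((self.seq, key, value))

    def state(self):
        """Current table contents: the latest value written for each key."""
        return {key: value for _, key, value in self.wal}


class ToyBackupSite:
    def __init__(self):
        self.data = {}
        self.last_seq = 0  # highest sequence number already backed up

    def full_backup(self, cluster):
        """Baseline: copy the entire current state. Returns rows copied."""
        self.data = dict(cluster.state())
        self.last_seq = cluster.seq
        return len(self.data)

    def incremental_backup(self, cluster):
        """Ship only log entries newer than the last backup. Returns entries shipped."""
        new_entries = [e for e in cluster.wal if e[0] > self.last_seq]
        for seq, key, value in new_entries:
            self.data[key] = value
            self.last_seq = seq
        return len(new_entries)


cluster = ToyCluster()
cluster.put("row1", "a")
cluster.put("row2", "b")

dr_site = ToyBackupSite()
dr_site.full_backup(cluster)               # baseline copies everything once

cluster.put("row2", "B")
cluster.put("row3", "c")
shipped = dr_site.incremental_backup(cluster)
print(shipped, dr_site.data == cluster.state())
```

The design point is that the amount of data moved by each incremental run depends on how much changed, not on how large the table is.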
