Support Questions

mqureshi · ‎02-14-2017

I have a question about HBase backup. Let's say I am running HBase replication. Replication is master-slave, and the slave is the DR site. Now imagine that a network failure occurs or the slave cluster dies, while the master keeps running just fine. We bring the slave back up after 3 hours (or however many hours). What's the best way to make sure the slave gets the data for those three hours?

I was thinking about using CopyTable with startTime and endTime, but wanted to confirm that's the right/best approach. Also, how much load will CopyTable create on the master cluster? I read in the documentation that I should be able to run CopyTable from my target machine (DR in this case). Is my understanding correct, and how does this alleviate any load? Is that because MapReduce is running on the remote cluster and reading data from a remote machine?

cskrabak · ‎02-14-2017

Three hours (or however many) of slave cluster downtime is not really an exceptional case for replication. According to the documents linked below, HBase and ZooKeeper will collect a backlog of edits and, once the slave cluster is up again, replicate the older edits the same way as newer ones. So in this normal case, the best way is to let it do its job.

https://hbase.apache.org/0.94/replication.html#Normal_processing

https://hbase.apache.org/0.94/replication.html#Non-responding_slave_clusters

Abnormal cases can occur if table data gets corrupted and replication breaks. Then you may have to copy. I don't know how much load that puts on the master.
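
If you do end up having to copy, here is a rough sketch (not a tested procedure) of how the time window for CopyTable could be worked out. It assumes cell timestamps are the default server-side epoch milliseconds; if the application writes its own timestamps, a time-range copy will not select the right cells. The table name, the ZooKeeper quorum string, and the helper function names are made up for illustration. The `--starttime`, `--endtime` and `--peer.adr` options are the ones CopyTable documents, but check the usage output of your HBase version before relying on them.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to whole epoch milliseconds."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def build_copytable_command(table, peer_adr, outage_start, outage_end,
                            margin=timedelta(minutes=10)):
    """Build the CopyTable argument list for an outage window.

    The window is widened by `margin` on both sides (an assumption, to
    allow for clock skew and for not knowing the exact failure time).
    Re-copying cells that already exist on the peer is harmless when the
    timestamps are identical.
    """
    start_ms = to_epoch_millis(outage_start - margin)
    end_ms = to_epoch_millis(outage_end + margin)
    return [
        "hbase", "org.apache.hadoop.hbase.mapreduce.CopyTable",
        f"--starttime={start_ms}",
        f"--endtime={end_ms}",
        # Format is <zk quorum>:<zk client port>:<znode parent>
        f"--peer.adr={peer_adr}",
        table,
    ]


if __name__ == "__main__":
    # Hypothetical table, quorum and outage times, for illustration only.
    cmd = build_copytable_command(
        table="my_table",
        peer_adr="dr-zk1,dr-zk2,dr-zk3:2181:/hbase",
        outage_start=datetime(2017, 2, 14, 9, 0, tzinfo=timezone.utc),
        outage_end=datetime(2017, 2, 14, 12, 0, tzinfo=timezone.utc),
    )
    print(" ".join(cmd))
```

The printed command would be run on the source (master) cluster, with `--peer.adr` pointing at the DR cluster's ZooKeeper quorum.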

What's the best approach to bring HBase DR in sync with Master