I would like to make sure that all data in my main table has been replicated to my secondary cluster by a specific time every day.
At the moment my strategy is to run "status 'replication'" in the HBase shell and check, among other things, that "SizeOfLogQueue" = 0. If it is not, I would like a way to prioritize replication catch-up over other activity.
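For reference, the check I have in mind could be scripted roughly as below. This is only a sketch: it assumes the shell prints the metric as "SizeOfLogQueue=<number>" (the exact output format may differ between HBase versions) and that the "hbase" launcher is on the PATH. The function names are my own, not part of HBase.

```python
import re
import subprocess
from typing import List


def log_queue_sizes(status_output: str) -> List[int]:
    """Extract every SizeOfLogQueue value found in the shell output.

    Assumes the metric is printed as 'SizeOfLogQueue=<n>'.
    """
    return [int(n) for n in re.findall(r"SizeOfLogQueue\s*=\s*(\d+)", status_output)]


def replication_caught_up(status_output: str) -> bool:
    """True only if at least one queue was reported and all of them are empty."""
    sizes = log_queue_sizes(status_output)
    return bool(sizes) and all(size == 0 for size in sizes)


def fetch_replication_status() -> str:
    """Run the status command through a non-interactive HBase shell."""
    result = subprocess.run(
        ["hbase", "shell", "-n"],
        input="status 'replication'\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    output = fetch_replication_status()
    if replication_caught_up(output):
        print("All replication log queues are empty")
    else:
        print("Replication still has queued logs:", log_queue_sizes(output))
```

An empty result is treated as "not caught up" on purpose, so that a changed output format or a failed parse does not silently pass the check.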
The ReplicationSyncUp tool seems like a viable option for this.
My question is: can the ReplicationSyncUp tool be used safely while HBase is running? Also, are there any reasons, such as excessive strain on the cluster, why I should not use the tool for this purpose?
The Jira issue for the tool mentions that it could be used while HBase is up, but does not conclude whether it was ultimately implemented that way.
Answers, as well as any speculation, would be appreciated.