Can anyone suggest the best method for backing up HBase data among distcp, CopyTable, export/import, and cluster replication?
In order of preference (both options also have very little impact on the running cluster):
Cluster replication: use this if you need to recover in real time and can afford a second cluster.
Export snapshot: use this if recovering to the last snapshot taken is acceptable. It costs less, because you can export the snapshot to any cheap storage (HDFS, S3, or similar). However, incremental backup is not possible with this approach, and old backups become obsolete with each new one.
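To make the export option concrete, here is a minimal sketch in Python that assembles the command line for HBase's ExportSnapshot tool. The snapshot name and destination URL are made-up placeholders, and the helper `build_export_snapshot_cmd` is my own name, not part of HBase. It only builds the command; running it is left to you.

```python
import shlex


def build_export_snapshot_cmd(snapshot, copy_to, mappers=None, bandwidth_mb=None):
    """Build the argv list for HBase's ExportSnapshot tool (hypothetical helper)."""
    if not snapshot or not copy_to:
        raise ValueError("snapshot name and destination URL are both required")
    cmd = [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,   # name of an existing snapshot
        "-copy-to", copy_to,     # destination root dir, e.g. on the backup cluster
    ]
    if mappers is not None:
        cmd += ["-mappers", str(mappers)]         # number of copy tasks
    if bandwidth_mb is not None:
        cmd += ["-bandwidth", str(bandwidth_mb)]  # throttle, in MB/s
    return cmd


if __name__ == "__main__":
    # Placeholder names: replace with your own snapshot and backup location.
    cmd = build_export_snapshot_cmd(
        "orders_snap", "hdfs://backup-nn:8020/hbase", mappers=16
    )
    print(shlex.join(cmd))
```

You could pass the resulting list to `subprocess.run(cmd, check=True)` on a node with the HBase client installed. Limiting mappers or bandwidth is a way to keep the copy from saturating the network.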
HBase snapshots let you capture the state of a table with very little impact on the Region Servers. Snapshot, clone, and restore operations don't involve copying data. Exporting a snapshot to another cluster also has no impact on the Region Servers, because the export runs as a MapReduce job that copies files at the HDFS level.
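For reference, the snapshot, clone, and restore operations map to HBase shell commands. The sketch below generates those commands as text from Python; the table and snapshot names are placeholders and the helper function names are my own. Note that a table has to be disabled before it can be restored from a snapshot.

```python
def snapshot_script(table, snapshot):
    """HBase shell command that takes a snapshot of a table."""
    return f"snapshot '{table}', '{snapshot}'\n"


def clone_script(snapshot, new_table):
    """HBase shell command that creates a new table from a snapshot."""
    return f"clone_snapshot '{snapshot}', '{new_table}'\n"


def restore_script(table, snapshot):
    """HBase shell commands that roll a table back to a snapshot.

    The table is disabled first and re-enabled afterwards.
    """
    lines = [
        f"disable '{table}'",
        f"restore_snapshot '{snapshot}'",
        f"enable '{table}'",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder names: replace with your own table and snapshot.
    print(snapshot_script("orders", "orders_snap"), end="")
    print(clone_script("orders_snap", "orders_copy"), end="")
    print(restore_script("orders", "orders_snap"), end="")
```

The generated text can be saved to a file or piped into the HBase shell on a client node.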