I need to relocate a physical cluster from one data center to a new one. Here is some information about the scenario:
Number of racks: 4
Rack awareness: yes
HA: enabled for HDFS, YARN and HBase
Used space: ~500TB (before replication).
RF: 3 (default).
Average memory: 128GB
Average CPU cores: 32
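For a rough sense of scale, here is a minimal sketch of what those numbers imply: the raw on-disk footprint and how long one full copy of the logical data might take. The link speeds and the 70% efficiency factor are my assumptions, not measurements, and I'm using decimal units (1 TB = 10^12 bytes).

```python
# Back-of-the-envelope sizing for the cluster described above.
# LOGICAL_TB and REPLICATION_FACTOR come from the post; link speeds and
# the efficiency factor are assumptions for illustration only.

LOGICAL_TB = 500          # used space before replication
REPLICATION_FACTOR = 3    # HDFS default


def raw_footprint_tb(logical_tb: float, rf: int) -> float:
    """Raw disk space consumed once every block is stored rf times."""
    return logical_tb * rf


def full_copy_days(logical_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    """Days needed to copy the logical data once over a link of link_gbps.

    efficiency is the assumed fraction of the nominal link rate that the
    copy job actually sustains.
    """
    bits_to_move = logical_tb * 1e12 * 8
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_move / usable_bits_per_second / 86400


if __name__ == "__main__":
    print(f"Raw footprint: {raw_footprint_tb(LOGICAL_TB, REPLICATION_FACTOR):.0f} TB")
    for gbps in (1, 10, 40):  # hypothetical inter-site link speeds
        days = full_copy_days(LOGICAL_TB, gbps)
        print(f"Full copy at {gbps} Gbps: about {days:.1f} days")
```

Only the logical 500TB has to cross the wire, since the destination re-replicates on its own, but the destination still needs room for all the replicas.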
Having said that, let's get to the point: there is no backup, and there is no way to guarantee that the nodes will keep working after being powered off and moved. Some nodes have not been powered off for more than 500 days, and I have no idea how their disks will handle the move. I mention this because, with RF 3, losing several nodes in different racks at the same time could wipe out every replica of some blocks, which means data loss. Again, there is no backup. I need to ensure the cluster is up and running after the move, without data loss.
I've been thinking of two different approaches to mitigate this, as a kind of disaster recovery plan:
Back up all the data (which can take too much time); I would also need to work out how to handle the daily aggregations and incoming data.
Create a second cluster as a replica and keep the two in sync (the replica could be physical or in the cloud).
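To make the second approach more concrete, this is roughly what I have in mind for keeping the two clusters in sync: an initial copy from an HDFS snapshot, followed by periodic snapshot-diff copies that would pick up the daily incoming data. It is only a sketch that builds the commands without running them. The directory `/data`, the target `hdfs://new-cluster/data`, and the snapshot names are hypothetical, and it covers plain HDFS directories only (not HBase tables).

```python
# Sketch: build the HDFS snapshot + DistCp commands for an initial copy and
# for later incremental syncs. Nothing is executed here; the commands are
# only printed. All paths and snapshot names are hypothetical placeholders.
import shlex


def initial_copy_cmds(src_dir: str, dst_uri: str, snap: str) -> list[list[str]]:
    """Commands for the first full copy, taken from a consistent snapshot."""
    return [
        ["hdfs", "dfsadmin", "-allowSnapshot", src_dir],
        ["hdfs", "dfs", "-createSnapshot", src_dir, snap],
        ["hadoop", "distcp", "-update", f"{src_dir}/.snapshot/{snap}", dst_uri],
        # Afterwards the destination directory also needs a snapshot with the
        # same name (created on the destination cluster) so that later
        # snapshot-diff copies have a common baseline.
    ]


def incremental_sync_cmds(src_dir: str, dst_uri: str,
                          prev_snap: str, new_snap: str) -> list[list[str]]:
    """Commands that copy only what changed between two source snapshots."""
    return [
        ["hdfs", "dfs", "-createSnapshot", src_dir, new_snap],
        # -diff requires -update, and the destination must be unchanged
        # since its own prev_snap snapshot was taken.
        ["hadoop", "distcp", "-update", "-diff", prev_snap, new_snap,
         src_dir, dst_uri],
    ]


if __name__ == "__main__":
    src, dst = "/data", "hdfs://new-cluster/data"  # hypothetical locations
    for cmd in initial_copy_cmds(src, dst, "s1"):
        print(shlex.join(cmd))
    for cmd in incremental_sync_cmds(src, dst, "s1", "s2"):
        print(shlex.join(cmd))
```

I have not tested this at this scale, so I don't know whether it holds up with ~500TB and daily writes.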
I was wondering if someone could help me with this. Any suggestions are welcome.