My question is: after using DistCp to transfer all the HDFS data to a second cluster, can I replicate the whole Hive metastore database (manually or using DB HA) to accomplish the backup/restore? Or should I import/export only some specific Hive metastore tables?
Are you using the BDR tool available with a Cloudera Enterprise license?
If that is the case, then you should probably use that tool with two separate Hive Metastores. No DB copying is required. (I am trying this now, so I am no expert yet :-)
If you aren't, then you might consider having a shared metastore. Would that work for you?
Finally, if all you are doing is creating a disaster-recovery backup, then I would assume you need all the tables in the Hive Metastore. But that is a guess.
I'm not using BDR.
I want to make a hot backup cluster, so I'll test replicating the whole metastore database every X minutes and see the results.
Thanks for your replies.
If you cannot use BDR Hive replication (https://www.cloudera.com/documentation/enterprise/latest/topics/cm_bdr_hive_replication.html), then one option is to copy the database. I know it has been done, but I'm not sure of the steps.
I would recommend inquiring in a new thread on the Hive Community message board.
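For anyone who wants to try the copy-the-database route discussed in this thread, here is a rough sketch of what one replication cycle might look like. It only builds and prints the commands, it does not run them. It assumes the metastore is backed by MySQL (the thread does not say which database is used) and that credentials come from a MySQL option file. All host names, the database name, the user and the warehouse path are hypothetical placeholders. The last step uses Hive's metatool to rewrite the HDFS locations stored in the metastore so they point at the backup cluster's NameNode; please check that step against the documentation for your Hive version before relying on it.

```python
import shlex


def distcp_command(src_fs, dst_fs, path):
    # -update copies only files that are missing or differ on the target
    return ["hadoop", "distcp", "-update", src_fs + path, dst_fs + path]


def metastore_copy_pipeline(src_db_host, dst_db_host, db_user, db_name):
    # Assumption: MySQL-backed metastore, password supplied via an option file.
    # --single-transaction takes a consistent snapshot of InnoDB tables.
    dump = ["mysqldump", "--single-transaction",
            "-h", src_db_host, "-u", db_user, "--databases", db_name]
    # --databases makes the dump include CREATE DATABASE / USE statements,
    # so the restore side does not need a database argument.
    restore = ["mysql", "-h", dst_db_host, "-u", db_user]
    return shlex.join(dump) + " | " + shlex.join(restore)


def update_location_command(new_fs, old_fs):
    # To be run on the backup cluster: rewrites the filesystem root recorded
    # in the copied metastore from the primary NameNode to the backup one.
    return ["hive", "--service", "metatool", "-updateLocation", new_fs, old_fs]


def build_plan(src_fs, dst_fs, path, src_db_host, dst_db_host, db_user, db_name):
    """Return the shell commands for one replication cycle, in order."""
    return [
        shlex.join(distcp_command(src_fs, dst_fs, path)),
        metastore_copy_pipeline(src_db_host, dst_db_host, db_user, db_name),
        shlex.join(update_location_command(dst_fs, src_fs)),
    ]


if __name__ == "__main__":
    # Hypothetical values, replace with your own.
    for step in build_plan("hdfs://nn-a:8020", "hdfs://nn-b:8020",
                           "/user/hive/warehouse",
                           "db-a", "db-b", "hive", "metastore"):
        print(step)
```

Note that the HDFS copy and the database dump are taken at different moments, so the copied metastore may briefly reference files that have not arrived yet. That is worth testing before treating the second cluster as a hot backup.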