Created 03-22-2017 02:29 PM
Hi,
I have a cluster of 24 nodes. I want to back up multiple Hive tables and migrate them to another cluster that has 3 nodes. Can anyone tell me the best way to do this?
Created 03-22-2017 03:02 PM
1) Stop Hive on the target cluster
2) Use DistCp to copy all the necessary HDFS files to the target cluster.
3) Take a SQL dump of your Hive Metastore (which is in MySQL or Postgres).
4) Restore the SQL dump on your target cluster.
5) Use the Hive MetaTool "-updateLocation" command on the target cluster to change the HDFS locations recorded in the Metastore so that they point to the target cluster's NameNode
https://cwiki.apache.org/confluence/display/Hive/Hive+MetaTool
6) Start Hive on the target cluster
To make the process easier, assuming this is a one-time thing, I suggest that you copy the entire Metastore rather than trying to pick and choose certain tables. While being selective is possible, it will add a bit more complexity to your process.
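As a rough sketch of steps 2 to 5, the short Python script below only builds and prints the commands; it does not run anything. The NameNode addresses, warehouse path, database name, user, and dump file name are all hypothetical placeholders, and it assumes a MySQL-backed Metastore (for Postgres you would swap in pg_dump and psql). Stopping and starting Hive (steps 1 and 6) is left to whatever service manager you use.

```python
import shlex


def build_migration_plan(source_nn, target_nn, warehouse_dir,
                         metastore_db, db_user, dump_file):
    """Return (where_to_run, argv) pairs for steps 2-5. Nothing is executed."""
    src_root = f"hdfs://{source_nn}"
    dst_root = f"hdfs://{target_nn}"
    return [
        # Step 2: copy the table data between clusters with DistCp.
        ("either cluster",
         ["hadoop", "distcp", src_root + warehouse_dir, dst_root + warehouse_dir]),
        # Step 3: dump the Metastore database (MySQL assumed); -p prompts for a password.
        ("source Metastore DB host",
         ["mysqldump", "-u", db_user, "-p", f"--result-file={dump_file}", metastore_db]),
        # Step 4: load the dump into the target Metastore database.
        ("target Metastore DB host",
         ["mysql", "-u", db_user, "-p", metastore_db, "-e", f"source {dump_file}"]),
        # Step 5: MetaTool takes the NEW location first, then the OLD one.
        ("target cluster",
         ["hive", "--service", "metatool", "-updateLocation", dst_root, src_root]),
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    plan = build_migration_plan(
        source_nn="source-nn:8020",
        target_nn="target-nn:8020",
        warehouse_dir="/apps/hive/warehouse",
        metastore_db="hive",
        db_user="hive",
        dump_file="metastore_dump.sql",
    )
    for where, argv in plan:
        print(f"[{where}] {shlex.join(argv)}")
```

Review the printed commands before running them by hand. Running "hive --service metatool -listFSRoot" on the target cluster before and after step 5 is a simple way to check that the locations were actually rewritten.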
Created 03-22-2017 03:12 PM
Thanks, Eyad. I will try this. Thanks again.
Created 03-22-2017 06:01 PM
The post below also discusses backing up Hive tables:
https://community.hortonworks.com/questions/78292/backup-specific-hive-table.html