Created 09-12-2017 05:52 PM
We have a hive database on cluster 1. We want to move this hive database to cluster 2. Both clusters have other hive databases as well. For Hive backup, we normally backup the hive metadata on MYSQL and also the physical hive files (database directories with table sub-directories) and use this for restore. So for this purpose of backing up a single database we can get the particular database's data directory. But how to get the metadata for a particular database from MYSQL? Appreciate any insights.
Created 09-13-2017 12:36 PM
Hi @n c,
You can also copy the hdfs file, then re-create the hive table on the target hive and then perform a compute statistic for all the metadata.
You can get the create statement of the table by doing the following query: show create table xxxxx
Created 09-13-2017 12:36 PM
Hi @n c,
You can also copy the hdfs file, then re-create the hive table on the target hive and then perform a compute statistic for all the metadata.
You can get the create statement of the table by doing the following query: show create table xxxxx
Created 09-13-2017 01:53 PM
Hi msumbul, thats exactly what i thought as well. but hortonworks is saying it will mess up the metadata. "No, it is not tested and supported. The metadata for the tables will be totally different."
BTW is there an export capability in hive. ie can we export and import data. That way also I think we can create empty tables and import them in. Appreciate the insights.
Created 09-14-2017 09:29 AM
Hi @n c,
You don't have to copy the metadata. Copy the folder structure with all the data to the new cluster. Recreate the table and don't forget to do a compute statistic on the table. This will recreate a lot of metadata in order to have the CBO working fine.
There's no need for export tool because you can directly copy the data from HDFS 🙂
Michel
Created 09-14-2017 11:13 AM
Hi @n c
You can use Insert Overwrite Local Directory command in hive to export to the desired format and use distcp to copy the files or even the complete database in hive( which means entire files which are created under each tables in a database) into the second cluster. Once the files are moved to new cluster take the DDL for previous cluster and create the hive tables. Once its done you can either insert/copy the files into hive tables in new cluster. Hope it Helps!!
Created 09-14-2017 06:47 PM
Is there any command to show the create database command?
Created 09-15-2017 03:59 AM
No we cant get the something similar to DDL of a table in terms of database. But we can use describe database to see the other properties. Hope it helps!
Created 09-15-2017 06:18 PM
Hey all, thanks so much for the input. i went with msumbul's instructions and was able to successfully port a hive database from one cluster to another - so special thanks to msumbul!!! BTW I took out all the CREATE TABLEs out of the database. But what about other database objects? Procedures etc? Is there anyway to check for objects other than tables in a Hive database? Appreciate the insights.
Created 09-16-2017 12:18 PM
Hi @n c,
You are welcome! 🙂 . I don't think there's other object in hive (but not sure) there's the UDF for that you need to export the jar and you use for UDF in you first cluster.
May I ask you to accept my answer? 🙂
Thanks!
Michel