Support Questions

Find answers, ask questions, and share your expertise

Moving a hive database from one cluster to another

avatar
Rising Star

We have a hive database on cluster 1. We want to move this hive database to cluster 2. Both clusters have other hive databases as well. For Hive backup, we normally backup the hive metadata on MYSQL and also the physical hive files (database directories with table sub-directories) and use this for restore. So for this purpose of backing up a single database we can get the particular database's data directory. But how to get the metadata for a particular database from MYSQL? Appreciate any insights.

1 ACCEPTED SOLUTION

avatar
Expert Contributor

Hi @n c,

You can also copy the hdfs file, then re-create the hive table on the target hive and then perform a compute statistic for all the metadata.
You can get the create statement of the table by doing the following query: show create table xxxxx

View solution in original post

8 REPLIES 8

avatar
Expert Contributor

Hi @n c,

You can also copy the hdfs file, then re-create the hive table on the target hive and then perform a compute statistic for all the metadata.
You can get the create statement of the table by doing the following query: show create table xxxxx

avatar
Rising Star

Hi msumbul, thats exactly what i thought as well. but hortonworks is saying it will mess up the metadata. "No, it is not tested and supported. The metadata for the tables will be totally different."

BTW is there an export capability in hive. ie can we export and import data. That way also I think we can create empty tables and import them in. Appreciate the insights.

avatar
Expert Contributor

Hi @n c,

You don't have to copy the metadata. Copy the folder structure with all the data to the new cluster. Recreate the table and don't forget to do a compute statistic on the table. This will recreate a lot of metadata in order to have the CBO working fine.

There's no need for export tool because you can directly copy the data from HDFS 🙂

Michel

avatar

Hi @n c

You can use Insert Overwrite Local Directory command in hive to export to the desired format and use distcp to copy the files or even the complete database in hive( which means entire files which are created under each tables in a database) into the second cluster. Once the files are moved to new cluster take the DDL for previous cluster and create the hive tables. Once its done you can either insert/copy the files into hive tables in new cluster. Hope it Helps!!

avatar
Rising Star

Is there any command to show the create database command?

avatar
@n c

No we cant get the something similar to DDL of a table in terms of database. But we can use describe database to see the other properties. Hope it helps!

avatar
Rising Star

Hey all, thanks so much for the input. i went with msumbul's instructions and was able to successfully port a hive database from one cluster to another - so special thanks to msumbul!!! BTW I took out all the CREATE TABLEs out of the database. But what about other database objects? Procedures etc? Is there anyway to check for objects other than tables in a Hive database? Appreciate the insights.

avatar
Expert Contributor

Hi @n c,

You are welcome! 🙂 . I don't think there's other object in hive (but not sure) there's the UDF for that you need to export the jar and you use for UDF in you first cluster.

May I ask you to accept my answer? 🙂

Thanks!

Michel