how to migrate hive partitioned db to new cluster

New Contributor

Hi, we have a dev cluster with 5 nodes and a prod cluster with 5 nodes, both with Hive installed. I want to migrate partitioned Hive tables from the dev cluster to the prod cluster.

Can someone help me properly migrate the tables and metastore to the prod cluster?

Thanks in advance.

1 ACCEPTED SOLUTION

Master Collaborator

@raja reddy

You can copy the HDFS files from your dev cluster to the prod cluster, re-create the Hive tables on the prod cluster, and then run MSCK REPAIR TABLE on each table so that the prod metastore registers the copied partitions. Note that MSCK REPAIR TABLE only adds partition metadata; if you also need fresh statistics, run ANALYZE TABLE ... COMPUTE STATISTICS afterwards. To re-create the tables, get each table's CREATE statement by running SHOW CREATE TABLE <table_name> on your dev cluster.

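The high-level steps involved in a Hive migration are:

1. Copy the table data in HDFS from the dev cluster to the prod cluster with distcp.
2. Run SHOW CREATE TABLE <table_name> on the dev cluster to get the CREATE statement.
3. Run that CREATE statement on the prod cluster.
4. Run MSCK REPAIR TABLE <table_name> on the prod cluster to register the partitions.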
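The steps described in this answer can be sketched as a small shell script. This is only an illustration, not a tested migration tool: the NameNode and HiveServer2 addresses, the warehouse path, the table name sales.orders, and the file name orders.ddl.sql are all hypothetical placeholders, and the script defaults to a dry run that only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch of the migration steps. All hosts, ports, paths, and table names
# below are hypothetical placeholders; replace them with your own values.
set -euo pipefail

DEV_NN="hdfs://dev-nn.example.com:8020"                     # assumed dev NameNode URI
PROD_NN="hdfs://prod-nn.example.com:8020"                   # assumed prod NameNode URI
DEV_HS2="jdbc:hive2://dev-hs2.example.com:10000/default"    # assumed dev HiveServer2 URL
PROD_HS2="jdbc:hive2://prod-hs2.example.com:10000/default"  # assumed prod HiveServer2 URL
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# In dry-run mode print the command (without shell quoting); otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: copy a table directory, including its partition subdirectories.
copy_table_data() {
  local path="$1"
  run hadoop distcp -update "${DEV_NN}${path}" "${PROD_NN}${path}"
}

# Step 2: print the CREATE statement of a table on the dev cluster.
show_ddl() {
  local table="$1"
  run beeline -u "$DEV_HS2" --silent=true --showHeader=false \
    --outputformat=tsv2 -e "SHOW CREATE TABLE ${table}"
}

# Step 3: run a reviewed DDL file on the prod cluster.
create_table() {
  local ddl_file="$1"
  run beeline -u "$PROD_HS2" -f "$ddl_file"
}

# Step 4: register the copied partitions in the prod metastore.
repair_partitions() {
  local table="$1"
  run beeline -u "$PROD_HS2" -e "MSCK REPAIR TABLE ${table}"
}

copy_table_data /apps/hive/warehouse/sales.db/orders
show_ddl sales.orders
create_table orders.ddl.sql
repair_partitions sales.orders
```

When running for real (DRY_RUN=0), save the output of show_ddl to a file and review it before step 3: the generated statement usually contains a LOCATION clause pointing at the dev NameNode, which has to be changed to the prod path first.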

If the clusters are Kerberized, refer to the link below for configuring distcp between them.

https://community.hortonworks.com/content/supportkb/151079/configure-distcp-between-two-clusters-wit...

Note: There's no need for an export, because you can copy the data directly in HDFS from one cluster to the other.

Please accept the answer you found most useful
