Support Questions
Find answers, ask questions, and share your expertise

how to migrate Hive data over to new cluster?

Expert Contributor

We have a Old Cluster with HDP 2.0.6 (Hive 0.12.0).. New Cluster with HDP 2.3.2 (Hive 1.2.1). We need to hive query the same tables in new environment and old environment after data forklift in same manner (implying database names, partitioning, etc. work). We migrated all the hdfs data over using distcp.

1 ACCEPTED SOLUTION

Master Collaborator

If its just a migration, you can make a copy of your old database and then point the HDP 2.3.2 cluster to the copy of your old database. Before starting the Hive services upgrade the hive database by using schemaTool. You can use metatool to update the HDFS locations to the new cluster. Start the hive services.

View solution in original post

10 REPLIES 10

Master Collaborator

If its just a migration, you can make a copy of your old database and then point the HDP 2.3.2 cluster to the copy of your old database. Before starting the Hive services upgrade the hive database by using schemaTool. You can use metatool to update the HDFS locations to the new cluster. Start the hive services.

Expert Contributor

Hi @Deepesh yes finally I was able to migrate using similar steps of db backup, restore & point. I will update the steps soon. Only applicable in clean env. Doesn't seem to be a clean approach in case there is already pre-existing database & tables in hive . We should have a way to extract ddl scripts for all tables / database either from MySQL or hive & run it on new env in hive.

@Saumil MayaniHDP2.3.2 has Hive 1.2.1? Did you mean HDP 2.2.x or HDP2.3.x?

Expert Contributor

yes @Pradeep, you are right .. It is hdp 2.3.2 & hive 1.2.1

@Saumil Mayani

If you want to get ddl of tables

mysql -u hive -p -e " select concat(  'show create table ' , TBL_NAME,';') from TBLS" hive > file.sql  
remove header in file.sql
hive -f /tmp/file.sql

Details

[root@phdns02 ~]#

Table names

login to mysql

use hive;

select concat( 'show create table ' , TBL_NAME,';') from TBLS;

or

mysql -u hive -p -e " select concat( 'show create table ' , TBL_NAME,';') from TBLS" hive > file.sql

548-screen-shot-2015-11-20-at-101152-pm.png

You can add above output in shell and run this

[root@phdns02 ~]# su - hive

[hive@phdns02 ~]$ hive -f /tmp/h.ddl

WARNING: Use "yarn jar" to launch YARN applications.

Logging initialized using configuration in file:/etc/hive/2.3.4.0-3276/0/hive-log4j.properties

OK

CREATE TABLE `hadoop`(

`name` string)

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

STORED AS INPUTFORMAT

'org.apache.hadoop.mapred.TextInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

LOCATION

'hdfs://phdns01.cloud.hortonworks.com:8020/apps/hive/warehouse/hadoop'

TBLPROPERTIES (

'transient_lastDdlTime'='1448062564')

Time taken: 2.045 seconds, Fetched: 12 row(s)

OK

CREATE TABLE `ns`(

`name` string)

ROW FORMAT SERDE

'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'

STORED AS INPUTFORMAT

'org.apache.hadoop.mapred.TextInputFormat'

OUTPUTFORMAT

'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

LOCATION

'hdfs://phdns01.cloud.hortonworks.com:8020/apps/hive/warehouse/ns'

TBLPROPERTIES (

'transient_lastDdlTime'='1448062211')

Time taken: 0.067 seconds, Fetched: 12 row(s)

[hive@phdns02 ~]$

Expert Contributor

Neeraj Sabharwal This is great for fetching all tables from Mysql, however, To avoid Error "FAILED: SemanticException [Error 10001]: Table not found Table Name"

The following would work.

mysql -u hive -p -e "select concat( 'show create table ' , T.NAME , '.', T.TBL_NAME,';') from (select DBS.NAME, TBLS.TBL_NAME from TBLS left join DBS on TBLS.DB_ID = DBS.DB_ID) T" hive > /tmp/file.ddl
##remove header in file.sql
hive -f /tmp/file.ddl > tmp/create_table.ddl

@Saumil Mayani nice! Def deserve upvote.

Contributor

Does this include View DDLs?

In case of oracle metastore , the query would to run from SQL Developer or TOAD would be :

SELECT 'show create table' || dbs.name || '.' || tbls.tbl_name || ';' FROM tbls LEFT JOIN dbs ON tbls.db_id = dbs.db_id

New Contributor

Do we have a way to generate the DDLs for hive databases ?

; ;