Community Articles

Find and share helpful community-sourced technical articles.
Labels (1)
avatar
Guru

GOAL

  • Migrate hive data and metadata
  • Ensure that metadata is updated based on the target clusters Hive metastore version
  • I have used tpch data for demonstration purposes

Steps (Performed on Old Cluster)

  • Take a note of all the directories you need to copy from old to new cluster on HDFS
[hive@xlnode-standalone datagen]$ hdfs dfs -ls /apps/hive/warehouse
Found 1 items
drwxrwxrwx   - hive hdfs          0 2017-03-01 23:31 /apps/hive/warehouse/tpch_text_2.db
[hive@xlnode-standalone datagen]$ beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p ''
WARNING: Use "yarn jar" to launch YARN applications.
Connecting to jdbc:hive2://localhost:10000/default
Connected to: Apache Hive (version 1.2.1000.2.4.3.0-227)
Driver: Hive JDBC (version 1.2.1000.2.4.3.0-227)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.4.3.0-227 by Apache Hive
0: jdbc:hive2://localhost:10000/default> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| tpch_text_2    |
+----------------+--+
2 rows selected (0.143 seconds)
0: jdbc:hive2://localhost:10000/default> use tpch_text_2;
No rows affected (0.06 seconds)
0: jdbc:hive2://localhost:10000/default> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| customer  |
| lineitem  |
| nation    |
| orders    |
| part      |
| partsupp  |
| region    |
| supplier  |
+-----------+--+
8 rows selected (0.069 seconds)
  • Ensure that the count of data from one/more table(s) for validation
0: jdbc:hive2://localhost:10000/default> select count(*) from lineitem;
INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: select count(*) from lineitem(Stage-1)
INFO  : 
INFO  : Status: Running (Executing on YARN cluster with App id application_1488410723867_0003)

INFO  : Map 1: -/-	Reducer 2: 0/1	
INFO  : Map 1: 0/6	Reducer 2: 0/1	
INFO  : Map 1: 0/6	Reducer 2: 0/1	
INFO  : Map 1: 0(+2)/6	Reducer 2: 0/1	
INFO  : Map 1: 0(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 0(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 1(+2)/6	Reducer 2: 0/1	
INFO  : Map 1: 1(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 2(+2)/6	Reducer 2: 0/1	
INFO  : Map 1: 2(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 3(+2)/6	Reducer 2: 0/1	
INFO  : Map 1: 3(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 3(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 3(+3)/6	Reducer 2: 0/1	
INFO  : Map 1: 4(+2)/6	Reducer 2: 0(+1)/1	
INFO  : Map 1: 5(+1)/6	Reducer 2: 0(+1)/1	
INFO  : Map 1: 6/6	Reducer 2: 0(+1)/1	
INFO  : Map 1: 6/6	Reducer 2: 1/1	
+-----------+--+
|    _c0    |
+-----------+--+
| 11997996  |
+-----------+--+
1 row selected (24.331 seconds)
0: jdbc:hive2://localhost:10000/default> 
  • Identify the data to be copied over
[hdfs@xlnode-standalone ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r--   3 hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/customer
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/lineitem
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/nation
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/orders
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/part
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/partsupp
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:30 /tmp/tpch-generate/2/region
drwxr-xr-x   - hive hdfs          0 2017-03-01 23:31 /tmp/tpch-generate/2/supplier


[hdfs@xlnode-standalone ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G  /tmp/tpch-generate/2

  • Identify the metastore database server and dump the hive metadata database
[hive@xlnode-standalone ~]$ mysqldump hive -u hive -p > hive.dump

Steps (Performed on New Cluster)

  • When copying the data from a non-secure to secure cluster, we can use this. Run this command as the hive user
hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo
  • Monitor the output to check if there are any failures
[hive@xlnode-3 ~]$ hadoop distcp hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate hdfs://xlnode-1.hwx.com:8020/tmp/
17/03/02 04:55:11 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate], targetPath=hdfs://xlnode-1.hwx.com:8020/tmp, targetPathExists=true, filtersFile='null'}
17/03/02 04:55:12 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:12 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:12 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:12 INFO hdfs.DFSClient: Cannot get delegation token from rm/xlnode-2.hwx.com@HWX.COM
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 25; dirCnt = 10
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Build file listing completed.
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:13 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:13 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:13 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 105 for hive on 172.26.94.233:8020
17/03/02 04:55:13 INFO security.TokenCache: Got dt for hdfs://xlnode-1.hwx.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: number of splits:10
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1488429683114_0003
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:15 INFO impl.YarnClientImpl: Submitted application application_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: The url to track the job: http://xlnode-2.hwx.com:8088/proxy/application_1488429683114_0003/
17/03/02 04:55:15 INFO tools.DistCp: DistCp job-id: job_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: Running job: job_1488429683114_0003
17/03/02 04:55:25 INFO mapreduce.Job: Job job_1488429683114_0003 running in uber mode : false
17/03/02 04:55:25 INFO mapreduce.Job:  map 0% reduce 0%
17/03/02 04:55:33 INFO mapreduce.Job:  map 10% reduce 0%
17/03/02 04:55:34 INFO mapreduce.Job:  map 20% reduce 0%
17/03/02 04:55:37 INFO mapreduce.Job:  map 30% reduce 0%
17/03/02 04:55:38 INFO mapreduce.Job:  map 40% reduce 0%
17/03/02 04:55:39 INFO mapreduce.Job:  map 50% reduce 0%
17/03/02 04:55:40 INFO mapreduce.Job:  map 60% reduce 0%
17/03/02 04:55:42 INFO mapreduce.Job:  map 80% reduce 0%
17/03/02 04:55:45 INFO mapreduce.Job:  map 90% reduce 0%
17/03/02 04:55:48 INFO mapreduce.Job:  map 100% reduce 0%
17/03/02 04:55:56 INFO mapreduce.Job: Job job_1488429683114_0003 completed successfully
17/03/02 04:55:57 INFO mapreduce.Job: Counters: 33
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=1492170
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=2217305398
		HDFS: Number of bytes written=2217296726
		HDFS: Number of read operations=233
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=60
	Job Counters 
		Launched map tasks=10
		Other local map tasks=10
		Total time spent by all maps in occupied slots (ms)=103725
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=103725
		Total vcore-milliseconds taken by all map tasks=103725
		Total megabyte-milliseconds taken by all map tasks=106214400
	Map-Reduce Framework
		Map input records=25
		Map output records=0
		Input split bytes=1150
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=2385
		CPU time spent (ms)=90360
		Physical memory (bytes) snapshot=3420536832
		Virtual memory (bytes) snapshot=46212300800
		Total committed heap usage (bytes)=2546466816
	File Input Format Counters 
		Bytes Read=7522
	File Output Format Counters 
		Bytes Written=0
	org.apache.hadoop.tools.mapred.CopyMapper$Counter
		BYTESCOPIED=2217296726
		BYTESEXPECTED=2217296726
		COPY=25
  • Validate if the data has been copied over succesfully
[hive@xlnode-3 ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r--   3 hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/customer
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/lineitem
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/nation
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/orders
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/part
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/partsupp
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/region
drwxr-xr-x   - hive hdfs          0 2017-03-02 04:55 /tmp/tpch-generate/2/supplier
[hive@xlnode-3 ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G  /tmp/tpch-generate/2


  • Assuming that HDP 2.5.3 is a new cluster (without any objects) and includes mysql as the metadata database
    • Stop the Metastore Process
    • Restore the backup from old clusters hive.dump to mysql database
[root@xlnode-3 ~]# mysql -u hive -D hive -p < /tmp/hive.dump 
Enter password: 
  • Because the hive dump was obtained from old cluster, it still might have the old hdfs location(s) for tables, partitions, use the following/similar metatool command to update the location
[hive@xlnode-3 ~]$ export HIVE_CONF_DIR=/etc/hive/2.5.3.0-37/0/conf.server; hive --service metatool -updateLocation hdfs://xlnode-standalone.hwx.com:8020 hdfs://xlnode-3.hwx.com:8020 -tablePropKey avro.schema.url -serdePropKey avro.schema.url
Initializing HiveMetaTool..
17/03/02 05:35:54 INFO metastore.ObjectStore: ObjectStore, initialize called
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.fixedDatastore unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/03/02 05:35:55 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Database,Type,FieldSchema,Order"
17/03/02 05:35:57 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
17/03/02 05:35:57 INFO metastore.ObjectStore: Initialized ObjectStore
Looking for LOCATION_URI field in DBS table to update..
Successfully updated the following locations..
Updated 0 records in DBS table
Looking for LOCATION field in SDS table to update..
Successfully updated the following locations..
Updated 0 records in SDS table
Looking for value of avro.schema.url key in TABLE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in TABLE_PARAMS table
Looking for value of avro.schema.url key in SD_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SD_PARAMS table
Looking for value of avro.schema.url key in SERDE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SERDE_PARAMS table


NOTE If this command fails to update the records/locations in SDS and DBS tables within hive metastore database, then you can perform the updates manually, however, it is STRONGLY recommended that you take a backup of the database at its current state.

mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT                             | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION                                                           | NUM_BUCKETS | OUTPUT_FORMAT                                              | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
|     1 |     1 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/lineitem |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        1 |
|     2 |     2 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/part     |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        2 |
|     3 |     3 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/supplier |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        3 |
|     4 |     4 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/partsupp |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        4 |
|     5 |     5 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/nation   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        5 |
|     6 |     6 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/region   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        6 |
|     7 |     7 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/customer |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        7 |
|     8 |     8 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/orders   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        8 |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)


mysql> create table SDS_BKP as select * from SDS;
Query OK, 8 rows affected (0.07 sec)
Records: 8  Duplicates: 0  Warnings: 0


mysql> begin;
Query OK, 0 rows affected (0.00 sec)


mysql> update SDS SET LOCATION = REPLACE(LOCATION,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE LOCATION LIKE '%xlnode-standalone.hwx.com%';
Query OK, 8 rows affected (0.00 sec)
Rows matched: 8  Changed: 8  Warnings: 0


mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT                             | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION                                                  | NUM_BUCKETS | OUTPUT_FORMAT                                              | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
|     1 |     1 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/lineitem |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        1 |
|     2 |     2 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/part     |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        2 |
|     3 |     3 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/supplier |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        3 |
|     4 |     4 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/partsupp |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        4 |
|     5 |     5 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/nation   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        5 |
|     6 |     6 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/region   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        6 |
|     7 |     7 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/customer |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        7 |
|     8 |     8 | org.apache.hadoop.mapred.TextInputFormat |               |                           | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/orders   |          -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |        8 |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)


mysql> 


mysql> commit;
Query OK, 0 rows affected (0.04 sec)

mysql> select * from DBS;
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                                          | NAME        | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
|     1 | Default Hive database | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse                | default     | public     | ROLE       |
|     2 | NULL                  | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive       | USER       |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)


mysql> commit;
Query OK, 0 rows affected (0.04 sec)


mysql> create table DBS_BKP as select * from DBS;
Query OK, 2 rows affected (0.04 sec)
Records: 2  Duplicates: 0  Warnings: 0


mysql> begin;
Query OK, 0 rows affected (0.00 sec)


mysql> update DBS SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE DB_LOCATION_URI LIKE '%xlnode-standalone.hwx.com%';
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2  Changed: 2  Warnings: 0


mysql> select * from DBS;
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC                  | DB_LOCATION_URI                                                 | NAME        | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
|     1 | Default Hive database | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse                | default     | public     | ROLE       |
|     2 | NULL                  | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive       | USER       |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)


mysql> commit;
Query OK, 0 rows affected (0.03 sec)


mysql> 


  • Next step is to perform the metastore upgrade as the dump was obtained from HDP 2.4.3 (Hive 1.2.1) and there, are/may have, additional objects which were introduced in the new version HDP 2.5.3 (Hive 2.1.0).
[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose

NOTE This step can produce some challenges, for instance, this tool tries to CREATE some INDEXEs that may already exist, here are two errors which I encountered:

[hive@xlnode-3 bin]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
|                                                 |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6)  NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.003 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
Error: Duplicate key name 'CONSTRAINTS_PARENT_TABLE_ID_INDEX' (state=42000,code=1061)


Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
	at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
	at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
	at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
	... 8 more
*** schemaTool failed ***

Above error is generated because our upgrade command is attempting to CREATE an INDEX which already exists, here is how we locate the script. From the above command, look for the main script file name "Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql" because it is a nested script and the scripts are usually located in a predefined place.

/usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql

[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/upgrade-2.0.0-to-2.1.0.mysql.sql
SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' ';
SOURCE 034-HIVE-13076.mysql.sql;
SOURCE 035-HIVE-13395.mysql.sql;
SOURCE 036-HIVE-13354.mysql.sql;

UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1;
SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' ';

The source of our issue is actually from another child script "034-HIVE-13076.mysql.sql" , review its contents and introduce a DROP INDEX command. The script looked something like this:

[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/034-HIVE-13076.mysql.sql 
CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS`
(
  `CHILD_CD_ID` BIGINT,
  `CHILD_INTEGER_IDX` INT(11),
  `CHILD_TBL_ID` BIGINT,
  `PARENT_CD_ID` BIGINT NOT NULL,
  `PARENT_INTEGER_IDX` INT(11) NOT NULL,
  `PARENT_TBL_ID` BIGINT NOT NULL,
  `POSITION` BIGINT NOT NULL,
  `CONSTRAINT_NAME` VARCHAR(400) NOT NULL,
  `CONSTRAINT_TYPE` SMALLINT(6)  NOT NULL,
  `UPDATE_RULE` SMALLINT(6),
  `DELETE_RULE` SMALLINT(6),
  `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL,
  PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS;   <<<<<< Manually added
CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE;

You might see another error stating that the table WRITE_SET already exists, once you rerun the upgradeSchema command after above changes. This table was introduced in HDP version greater than 2.4.3 (Hive 1.2.1)

[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
|                                                 |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6)  NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.002 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.188 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.075 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Error: Table 'WRITE_SET' already exists (state=42S01,code=1050)


Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
	at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
	at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
	at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
	at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
	... 8 more
*** schemaTool failed ***


Verify that the WRITE_SET table in the current database is empty, if it is, then just add a line at the top to drop this table

[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/035-HIVE-13395.mysql.sql 
DROP TABLE WRITE_SET;
CREATE TABLE WRITE_SET (
  WS_DATABASE varchar(128) NOT NULL,
  WS_TABLE varchar(128) NOT NULL,
  WS_PARTITION varchar(767),
  WS_TXNID bigint NOT NULL,
  WS_COMMIT_ID bigint NOT NULL,
  WS_OPERATION_TYPE char(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;


ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1);
  • Run the upgradeSchema command again and you should see an output similar to this upon completion
[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:	 jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :	 com.mysql.jdbc.Driver
Metastore connection User:	 hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
|                                                 |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0  |
+-------------------------------------------------+--+
1 row selected (0.018 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6)  NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.087 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.061 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP TABLE WRITE_SET
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.052 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1)
No rows affected (0.05 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPACTION_QUEUE ADD CQ_TBLPROPERTIES varchar(2048)
No rows affected (0.091 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPLETED_COMPACTIONS ADD CC_TBLPROPERTIES varchar(2048)
No rows affected (0.043 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1
1 row affected (0.016 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+----------------------------------------------------------+--+
|                                                          |
+----------------------------------------------------------+--+
| Finished upgrading MetaStore schema from 2.0.0 to 2.1.0  |
+----------------------------------------------------------+--+
1 row selected (0.001 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> !closeall
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
beeline> 
beeline> 
Completed upgrade-2.0.0-to-2.1.0.mysql.sql
schemaTool completed
[hive@xlnode-3 bin]$ 
  • Restart the Metastore Process via Ambari and validate bother version and data for tables
    • MySQL
    mysql> select * from VERSION;
    +--------+----------------+----------------------------+
    | VER_ID | SCHEMA_VERSION | VERSION_COMMENT            |
    +--------+----------------+----------------------------+
    |      1 | 2.1.0          | Hive release version 2.1.0 |
    +--------+----------------+----------------------------+
    1 row in set (0.00 sec)
    mysql> 
    
    • Hive Database
    0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> use tpch_text_2;
    No rows affected (0.146 seconds)
    0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> show tables;
    +-----------+--+
    | tab_name  |
    +-----------+--+
    | customer  |
    | lineitem  |
    | nation    |
    | orders    |
    | part      |
    | partsupp  |
    | region    |
    | supplier  |
    +-----------+--+
    8 rows selected (0.147 seconds)
    0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> select count(*) from lineitem;
    INFO  : Tez session hasn't been created yet. Opening session
    INFO  : Dag name: select count(*) from lineitem(Stage-1)
    INFO  : 
    
    
    INFO  : Status: Running (Executing on YARN cluster with App id application_1488431838473_0001)
    
    
    INFO  : Map 1: -/-	Reducer 2: 0/1	
    INFO  : Map 1: 0/12	Reducer 2: 0/1	
    INFO  : Map 1: 0(+2)/12	Reducer 2: 0/1	
    INFO  : Map 1: 0(+2)/12	Reducer 2: 0/1	
    INFO  : Map 1: 2(+2)/12	Reducer 2: 0/1	
    INFO  : Map 1: 2(+3)/12	Reducer 2: 0/1	
    INFO  : Map 1: 2(+5)/12	Reducer 2: 0/1	
    INFO  : Map 1: 2(+6)/12	Reducer 2: 0/1	
    INFO  : Map 1: 4(+4)/12	Reducer 2: 0/1	
    INFO  : Map 1: 4(+6)/12	Reducer 2: 0/1	
    INFO  : Map 1: 5(+5)/12	Reducer 2: 0/1	
    INFO  : Map 1: 5(+6)/12	Reducer 2: 0/1	
    INFO  : Map 1: 6(+5)/12	Reducer 2: 0/1	
    INFO  : Map 1: 8(+4)/12	Reducer 2: 0/1	
    INFO  : Map 1: 9(+3)/12	Reducer 2: 0(+1)/1	
    INFO  : Map 1: 10(+2)/12	Reducer 2: 0(+1)/1	
    INFO  : Map 1: 11(+1)/12	Reducer 2: 0(+1)/1	
    INFO  : Map 1: 12/12	Reducer 2: 0(+1)/1	
    INFO  : Map 1: 12/12	Reducer 2: 1/1	
    +-----------+--+
    |    _c0    |
    +-----------+--+
    | 11997996  |
    +-----------+--+
    1 row selected (30.208 seconds)
    0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> 
    
2,183 Views
Comments
avatar
Contributor

Good article and is still relevant for me.

I would also add the following for Kudu tables

create table hive.TABLE_PARAMS_BKUP as select * from hive.TABLE_PARAMS ;

UPDATE hive.TABLE_PARAMS 
SET PARAM_VALUE = 'masterhostname01.fqdn,masterhostname02.fqdn,masterhostname03.fqdn'
WHERE PARAM_KEY = 'kudu.master_addresses'
and PARAM_VALUE like '%oldmasterhostname%';

 And also for Sentry URI's that contain the HDFS namespace

create table sentry.SENTRY_DB_PRIVILEGE_BKUP as select * from sentry.SENTRY_DB_PRIVILEGE ;

UPDATE sentry.SENTRY_DB_PRIVILEGE
SET URI = REPLACE(URI, 'hdfs://oldcluster-ns', 'hdfs://newcluster-ns/')
WHERE URI like '%oldcluster-ns%';