Member since: 05-10-2016
Posts: 184
Kudos Received: 60
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4090 | 05-06-2017 10:21 PM
 | 4105 | 05-04-2017 08:02 PM
 | 5012 | 12-28-2016 04:49 PM
 | 1242 | 11-11-2016 08:09 PM
 | 3332 | 10-22-2016 03:03 AM
04-14-2017
02:58 PM
2 Kudos
Steps to Create Table in Hive on S3A with Ranger
1. Create a bucket with a unique name (I've used "myhivebucket") and do not change any details in the permissions.
2. Complete the "Create bucket" wizard by clicking on the "Create bucket" button.
3. Make the following entries in custom hdfs-site.xml:
fs.s3a.access.key = <access key>
fs.s3a.secret.key = <access secret>
fs.s3a.impl = org.apache.hadoop.fs.s3a.S3AFileSystem
To retrieve the values for the access key and secret, follow these steps:
- Login to https://aws.amazon.com/console
- Click on the "Sign in to the console" tab and login with appropriate credentials.
- Once logged in, you should see your login name in the top right corner of the AWS page. Click on the drop-down arrow beside your login name and select "My Security Credentials". This should take you to a page titled "Your Security Credentials".
- From this page, expand the option that says "Access Keys (Access Key ID and Secret Access Key)".
- You have to click on "create a new access key", because Amazon does not let you retrieve the secret of an existing key. This lets you download the key/secret in this format (this is not case sensitive):
AWSAccessKeyId=XXXXXXXXXXXXXXXXXXXXX
AWSSecretKey=XXXXXxxxxxXXXXXxxxxxXXXXX/xxxxx
- The value for "fs.s3a.access.key" will be the value of "AWSAccessKeyId"; the value for "fs.s3a.secret.key" will be the value of "AWSSecretKey".
4. Login to the Ranger admin interface and create a policy for the hive/desired user to allow the desired permissions.
5. Now login to Hive via beeline (with Kerberos credentials, if required) and create a table, ensuring that the location is on s3a:
[hive@xlnode-standalone ~]$ beeline -u "jdbc:hive2://xlnode-standalone.hwx.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
WARNING: Use "yarn jar" to launch YARN applications.
Connecting to jdbc:hive2://xlnode-standalone.hwx.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Connected to: Apache Hive (version 1.2.1000.2.4.3.0-227)
Driver: Hive JDBC (version 1.2.1000.2.4.3.0-227)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.4.3.0-227 by Apache Hive
0: jdbc:hive2://xlnode-standalone.hwx.com:218> create table mys3test (col1 int, col2 string) row format delimited fields terminated by ',' stored as textfile location 's3a://myhivebucket/test';
No rows affected (12.04 seconds)
0: jdbc:hive2://xlnode-standalone.hwx.com:218>
Now try and insert some rows:
0: jdbc:hive2://xlnode-standalone.hwx.com:218> insert into mys3test values (1,'test'),(2,'test');
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [hive] does not have [UPDATE] privilege on [default/mys3test] (state=42000,code=40000)
0: jdbc:hive2://xlnode-standalone.hwx.com:218>
The above error is intentional: since we do not have the "UPDATE" privilege assigned via Ranger, we cannot insert the values yet. Allow the permission and INSERT again to validate INSERT/UPDATE and SELECT:
0: jdbc:hive2://xlnode-standalone.hwx.com:218> insert into mys3test values (1,'test'),(2,'test');
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: insert into mys3test ...1,'test'),(2,'test')(Stage-1)
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1492107639289_0002)
INFO : Map 1: -/-
INFO : Map 1: 0/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 0(+1)/1
INFO : Map 1: 0/1
INFO : Map 1: 1/1
INFO : Loading data to table default.mys3test from s3a://myhivebucket/test/.hive-staging_hive_2017-04-13_19-27-13_226_6105571528298793138-1/-ext-10000
INFO : Table default.mys3test stats: [numFiles=1, numRows=2, totalSize=14, rawDataSize=12]
No rows affected (53.854 seconds)
0: jdbc:hive2://xlnode-standalone.hwx.com:218> select * from mys3test;
+----------------+----------------+--+
| mys3test.col1 | mys3test.col2 |
+----------------+----------------+--+
| 1 | test |
| 2 | test |
+----------------+----------------+--+
2 rows selected (3.554 seconds)
0: jdbc:hive2://xlnode-standalone.hwx.com:218>
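If the CREATE TABLE or INSERT fails at the filesystem layer rather than in Ranger, it can help to confirm that the S3A credentials work outside of Hive. A minimal hedged sketch, assuming the "myhivebucket" bucket from this walkthrough and the hdfs-site.xml entries above; "write-test" is a hypothetical object name:
hadoop fs -ls s3a://myhivebucket/                          # listing should succeed with the configured keys
hadoop fs -put /etc/hosts s3a://myhivebucket/write-test    # confirms write access to the bucket
hadoop fs -rm s3a://myhivebucket/write-test                # clean up the test object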
04-13-2017
11:19 PM
@aakash bhatt Try installing the library with "yum install libtirpc-devel -y"; it seems like that is what it is complaining about.
04-05-2017
03:30 AM
@Rajesh Babu Devabhaktuni Try stripping the command down and using fewer options. Break it into chunks to see where the problem is.
03-14-2017
09:23 PM
GOALS
Configure Ranger + Ranger KMS
Create an encryption Zone
OS used is CentOS/Redhat 6.6
At the end, you should be able to create an encryption zone and validate it using Hive
NOTE: This article is in a walkthrough mode wherein snapshots were taken from relevant screens to create a step-by-step guide on installing Ranger and Ranger KMS along with creation and validation of encrypted zones in Hadoop.
https://www.scribd.com/presentation/341887712/HDFS-Encryption-Zone-Hive-Orig
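For reference, a minimal command-line sketch of the zone creation and validation covered in the slides. The key name "mykey" and path "/enc_zone" are placeholders, and -createZone must be run as the hdfs superuser:
hadoop key create mykey                                  # key material is stored in Ranger KMS
hdfs dfs -mkdir /enc_zone                                # the zone directory must exist and be empty
hdfs crypto -createZone -keyName mykey -path /enc_zone   # turn the directory into an encryption zone
hdfs crypto -listZones                                   # confirm the zone and its key name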
03-03-2017
03:18 AM
1 Kudo
GOAL
Migrate Hive data and metadata, and ensure that the metadata is updated based on the target cluster's Hive metastore version. I have used tpch data for demonstration purposes.
Steps (Performed on Old Cluster)
Take a note of all the directories you need to copy from the old to the new cluster on HDFS:
[hive@xlnode-standalone datagen]$ hdfs dfs -ls /apps/hive/warehouse
Found 1 items
drwxrwxrwx - hive hdfs 0 2017-03-01 23:31 /apps/hive/warehouse/tpch_text_2.db
[hive@xlnode-standalone datagen]$ beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p ''
WARNING: Use "yarn jar" to launch YARN applications.
Connecting to jdbc:hive2://localhost:10000/default
Connected to: Apache Hive (version 1.2.1000.2.4.3.0-227)
Driver: Hive JDBC (version 1.2.1000.2.4.3.0-227)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.4.3.0-227 by Apache Hive
0: jdbc:hive2://localhost:10000/default> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| tpch_text_2 |
+----------------+--+
2 rows selected (0.143 seconds)
0: jdbc:hive2://localhost:10000/default> use tpch_text_2;
No rows affected (0.06 seconds)
0: jdbc:hive2://localhost:10000/default> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| customer |
| lineitem |
| nation |
| orders |
| part |
| partsupp |
| region |
| supplier |
+-----------+--+
8 rows selected (0.069 seconds)
Take the row count of one or more tables for later validation (a per-table loop is sketched after this output):
0: jdbc:hive2://localhost:10000/default> select count(*) from lineitem;
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select count(*) from lineitem(Stage-1)
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1488410723867_0003)
INFO : Map 1: -/- Reducer 2: 0/1
INFO : Map 1: 0/6 Reducer 2: 0/1
INFO : Map 1: 0/6 Reducer 2: 0/1
INFO : Map 1: 0(+2)/6 Reducer 2: 0/1
INFO : Map 1: 0(+3)/6 Reducer 2: 0/1
INFO : Map 1: 0(+3)/6 Reducer 2: 0/1
INFO : Map 1: 1(+2)/6 Reducer 2: 0/1
INFO : Map 1: 1(+3)/6 Reducer 2: 0/1
INFO : Map 1: 2(+2)/6 Reducer 2: 0/1
INFO : Map 1: 2(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+2)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 4(+2)/6 Reducer 2: 0(+1)/1
INFO : Map 1: 5(+1)/6 Reducer 2: 0(+1)/1
INFO : Map 1: 6/6 Reducer 2: 0(+1)/1
INFO : Map 1: 6/6 Reducer 2: 1/1
+-----------+--+
| _c0 |
+-----------+--+
| 11997996 |
+-----------+--+
1 row selected (24.331 seconds)
0: jdbc:hive2://localhost:10000/default>
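To capture counts for every table in one pass rather than one at a time, a small shell loop over beeline works. A hedged sketch, assuming the same connection string and the eight tpch tables listed above:
for t in customer lineitem nation orders part partsupp region supplier; do
  beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p '' \
    --silent=true -e "select '${t}' as tab, count(*) from tpch_text_2.${t};"
done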
Identify the data to be copied over:
[hdfs@xlnode-standalone ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r-- 3 hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/customer
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/lineitem
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/nation
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/orders
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/part
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/partsupp
drwxr-xr-x - hive hdfs 0 2017-03-01 23:30 /tmp/tpch-generate/2/region
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/supplier
[hdfs@xlnode-standalone ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G /tmp/tpch-generate/2
Identify the metastore database server and dump the hive metadata database:
[hive@xlnode-standalone ~]$ mysqldump hive -u hive -p > hive.dump
Steps (Performed on New Cluster)
When copying the data from a non-secure to a secure cluster, we can use the following. Run this command as the hive user:
hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo
Monitor the output to check if there are any failures:
[hive@xlnode-3 ~]$ hadoop distcp hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate hdfs://xlnode-1.hwx.com:8020/tmp/
17/03/02 04:55:11 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate], targetPath=hdfs://xlnode-1.hwx.com:8020/tmp, targetPathExists=true, filtersFile='null'}
17/03/02 04:55:12 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:12 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:12 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:12 INFO hdfs.DFSClient: Cannot get delegation token from rm/xlnode-2.hwx.com@HWX.COM
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 25; dirCnt = 10
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Build file listing completed.
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:13 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:13 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:13 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 105 for hive on 172.26.94.233:8020
17/03/02 04:55:13 INFO security.TokenCache: Got dt for hdfs://xlnode-1.hwx.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: number of splits:10
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1488429683114_0003
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:15 INFO impl.YarnClientImpl: Submitted application application_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: The url to track the job: http://xlnode-2.hwx.com:8088/proxy/application_1488429683114_0003/
17/03/02 04:55:15 INFO tools.DistCp: DistCp job-id: job_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: Running job: job_1488429683114_0003
17/03/02 04:55:25 INFO mapreduce.Job: Job job_1488429683114_0003 running in uber mode : false
17/03/02 04:55:25 INFO mapreduce.Job: map 0% reduce 0%
17/03/02 04:55:33 INFO mapreduce.Job: map 10% reduce 0%
17/03/02 04:55:34 INFO mapreduce.Job: map 20% reduce 0%
17/03/02 04:55:37 INFO mapreduce.Job: map 30% reduce 0%
17/03/02 04:55:38 INFO mapreduce.Job: map 40% reduce 0%
17/03/02 04:55:39 INFO mapreduce.Job: map 50% reduce 0%
17/03/02 04:55:40 INFO mapreduce.Job: map 60% reduce 0%
17/03/02 04:55:42 INFO mapreduce.Job: map 80% reduce 0%
17/03/02 04:55:45 INFO mapreduce.Job: map 90% reduce 0%
17/03/02 04:55:48 INFO mapreduce.Job: map 100% reduce 0%
17/03/02 04:55:56 INFO mapreduce.Job: Job job_1488429683114_0003 completed successfully
17/03/02 04:55:57 INFO mapreduce.Job: Counters: 33
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=1492170
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2217305398
HDFS: Number of bytes written=2217296726
HDFS: Number of read operations=233
HDFS: Number of large read operations=0
HDFS: Number of write operations=60
Job Counters
Launched map tasks=10
Other local map tasks=10
Total time spent by all maps in occupied slots (ms)=103725
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=103725
Total vcore-milliseconds taken by all map tasks=103725
Total megabyte-milliseconds taken by all map tasks=106214400
Map-Reduce Framework
Map input records=25
Map output records=0
Input split bytes=1150
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=2385
CPU time spent (ms)=90360
Physical memory (bytes) snapshot=3420536832
Virtual memory (bytes) snapshot=46212300800
Total committed heap usage (bytes)=2546466816
File Input Format Counters
Bytes Read=7522
File Output Format Counters
Bytes Written=0
org.apache.hadoop.tools.mapred.CopyMapper$Counter
BYTESCOPIED=2217296726
BYTESEXPECTED=2217296726
COPY=25
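If any of the copy mappers had failed, distcp can be re-run with -update so that only missing or changed files are transferred. Note that -update changes the path semantics: the contents of the source directory are copied into the target, so the target must name the destination directory itself. A hedged sketch using the hosts from this example:
hadoop distcp -update hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate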
Validate that the data has been copied over successfully:
[hive@xlnode-3 ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r-- 3 hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/customer
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/lineitem
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/nation
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/orders
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/part
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/partsupp
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/region
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/supplier
[hive@xlnode-3 ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G /tmp/tpch-generate/2
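Beyond total size, hdfs dfs -count gives a quick way to compare the two clusters; it reports directory count, file count, and content bytes, and the three numbers should match on both sides:
hdfs dfs -count /tmp/tpch-generate     # run on the old cluster, then repeat on the new cluster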
Assuming that HDP 2.5.3 is a new cluster (without any objects) and uses MySQL as the metadata database:
Stop the Metastore process.
Restore the backup from the old cluster's hive.dump to the MySQL database:
[root@xlnode-3 ~]# mysql -u hive -D hive -p < /tmp/hive.dump
Enter password:
Because the hive dump was obtained from the old cluster, it might still contain the old HDFS location(s) for tables and partitions. Use the following (or a similar) metatool command to update the locations:
[hive@xlnode-3 ~]$ export HIVE_CONF_DIR=/etc/hive/2.5.3.0-37/0/conf.server; hive --service metatool -updateLocation hdfs://xlnode-standalone.hwx.com:8020 hdfs://xlnode-3.hwx.com:8020 -tablePropKey avro.schema.url -serdePropKey avro.schema.url
Initializing HiveMetaTool..
17/03/02 05:35:54 INFO metastore.ObjectStore: ObjectStore, initialize called
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.fixedDatastore unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/03/02 05:35:55 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Database,Type,FieldSchema,Order"
17/03/02 05:35:57 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
17/03/02 05:35:57 INFO metastore.ObjectStore: Initialized ObjectStore
Looking for LOCATION_URI field in DBS table to update..
Successfully updated the following locations..
Updated 0 records in DBS table
Looking for LOCATION field in SDS table to update..
Successfully updated the following locations..
Updated 0 records in SDS table
Looking for value of avro.schema.url key in TABLE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in TABLE_PARAMS table
Looking for value of avro.schema.url key in SD_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SD_PARAMS table
Looking for value of avro.schema.url key in SERDE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SERDE_PARAMS table
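To double-check which filesystem root the metastore now records for each database, metatool's -listFSRoot option can be used; a hedged sketch with the same HIVE_CONF_DIR as above:
[hive@xlnode-3 ~]$ export HIVE_CONF_DIR=/etc/hive/2.5.3.0-37/0/conf.server; hive --service metatool -listFSRoot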
NOTE: If this command fails to update the records/locations in the SDS and DBS tables within the hive metastore database, you can perform the updates manually. However, it is STRONGLY recommended that you take a backup of the database in its current state first.
mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION | NUM_BUCKETS | OUTPUT_FORMAT | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| 1 | 1 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/lineitem | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1 |
| 2 | 2 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/part | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2 |
| 3 | 3 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/supplier | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3 |
| 4 | 4 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/partsupp | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4 |
| 5 | 5 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/nation | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5 |
| 6 | 6 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/region | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 6 |
| 7 | 7 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/customer | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 7 |
| 8 | 8 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/orders | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 8 |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)
mysql> create table SDS_BKP as select * from SDS;
Query OK, 8 rows affected (0.07 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update SDS SET LOCATION = REPLACE(LOCATION,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE LOCATION LIKE '%xlnode-standalone.hwx.com%';
Query OK, 8 rows affected (0.00 sec)
Rows matched: 8 Changed: 8 Warnings: 0
mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION | NUM_BUCKETS | OUTPUT_FORMAT | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| 1 | 1 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/lineitem | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1 |
| 2 | 2 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/part | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2 |
| 3 | 3 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/supplier | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3 |
| 4 | 4 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/partsupp | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4 |
| 5 | 5 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/nation | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5 |
| 6 | 6 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/region | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 6 |
| 7 | 7 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/customer | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 7 |
| 8 | 8 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/orders | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 8 |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)
mysql>
mysql> commit;
Query OK, 0 rows affected (0.04 sec)
mysql> select * from DBS;
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
| 1 | Default Hive database | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse | default | public | ROLE |
| 2 | NULL | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive | USER |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.04 sec)
mysql> create table DBS_BKP as select * from DBS;
Query OK, 2 rows affected (0.04 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update DBS SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE DB_LOCATION_URI LIKE '%xlnode-standalone.hwx.com%';
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> select * from DBS;
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
| 1 | Default Hive database | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse | default | public | ROLE |
| 2 | NULL | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive | USER |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.03 sec)
mysql>
The next step is to perform the metastore upgrade, as the dump was obtained from HDP 2.4.3 (Hive 1.2.1) and there may be additional objects which were introduced in the new version, HDP 2.5.3 (Hive 2.1.0):
[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
NOTE: This step can produce some challenges; for instance, the tool tries to CREATE some INDEXes that may already exist. Here are two errors which I encountered:
[hive@xlnode-3 bin]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.003 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
Error: Duplicate key name 'CONSTRAINTS_PARENT_TABLE_ID_INDEX' (state=42000,code=1061)
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
... 8 more
*** schemaTool failed ***
The above error is generated because the upgrade command is attempting to CREATE an INDEX which already exists. Here is how we locate the script: from the output above, note the main script file name ("Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql"). It is a nested script, and the scripts are usually located in a predefined place:
/usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql
[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/upgrade-2.0.0-to-2.1.0.mysql.sql
SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' ';
SOURCE 034-HIVE-13076.mysql.sql;
SOURCE 035-HIVE-13395.mysql.sql;
SOURCE 036-HIVE-13354.mysql.sql;
UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1;
SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' ';
The source of our issue is actually another child script, "034-HIVE-13076.mysql.sql". Review its contents and introduce a DROP INDEX command. After the edit, the script looked something like this:
[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/034-HIVE-13076.mysql.sql
CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS`
(
`CHILD_CD_ID` BIGINT,
`CHILD_INTEGER_IDX` INT(11),
`CHILD_TBL_ID` BIGINT,
`PARENT_CD_ID` BIGINT NOT NULL,
`PARENT_INTEGER_IDX` INT(11) NOT NULL,
`PARENT_TBL_ID` BIGINT NOT NULL,
`POSITION` BIGINT NOT NULL,
`CONSTRAINT_NAME` VARCHAR(400) NOT NULL,
`CONSTRAINT_TYPE` SMALLINT(6) NOT NULL,
`UPDATE_RULE` SMALLINT(6),
`DELETE_RULE` SMALLINT(6),
`ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL,
PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS; <<<<<< Manually added
CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE;
Once you rerun the upgradeSchema command after the above changes, you might see another error stating that the table WRITE_SET already exists. This table was introduced in HDP versions greater than 2.4.3 (Hive 1.2.1):
[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.002 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.188 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.075 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Error: Table 'WRITE_SET' already exists (state=42S01,code=1050)
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
... 8 more
*** schemaTool failed ***
Verify that the WRITE_SET table in the current database is empty (a one-line check is sketched after this script). If it is, then just add a line at the top of the child script to drop this table:
[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/035-HIVE-13395.mysql.sql
DROP TABLE WRITE_SET;
CREATE TABLE WRITE_SET (
WS_DATABASE varchar(128) NOT NULL,
WS_TABLE varchar(128) NOT NULL,
WS_PARTITION varchar(767),
WS_TXNID bigint NOT NULL,
WS_COMMIT_ID bigint NOT NULL,
WS_OPERATION_TYPE char(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1);
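The emptiness check mentioned above can be done in one line from the shell before adding the DROP TABLE; a hedged sketch against the same MySQL metastore database:
[root@xlnode-3 ~]# mysql -u hive -p hive -e 'SELECT COUNT(*) FROM WRITE_SET;'   # expect 0 before dropping the table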
Run the upgradeSchema command again, and you should see output similar to this upon completion:
[hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.018 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.087 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.061 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP TABLE WRITE_SET
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.052 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1)
No rows affected (0.05 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPACTION_QUEUE ADD CQ_TBLPROPERTIES varchar(2048)
No rows affected (0.091 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPLETED_COMPACTIONS ADD CC_TBLPROPERTIES varchar(2048)
No rows affected (0.043 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1
1 row affected (0.016 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+----------------------------------------------------------+--+
| |
+----------------------------------------------------------+--+
| Finished upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+----------------------------------------------------------+--+
1 row selected (0.001 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> !closeall
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
beeline>
beeline>
Completed upgrade-2.0.0-to-2.1.0.mysql.sql
schemaTool completed
[hive@xlnode-3 bin]$
Restart the Metastore process via Ambari and validate both the version and the data for the tables.
MySQL:
mysql> select * from VERSION;
+--------+----------------+----------------------------+
| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |
+--------+----------------+----------------------------+
| 1 | 2.1.0 | Hive release version 2.1.0 |
+--------+----------------+----------------------------+
1 row in set (0.00 sec)
mysql>
Hive Database:
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> use tpch_text_2;
No rows affected (0.146 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| customer |
| lineitem |
| nation |
| orders |
| part |
| partsupp |
| region |
| supplier |
+-----------+--+
8 rows selected (0.147 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> select count(*) from lineitem;
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select count(*) from lineitem(Stage-1)
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1488431838473_0001)
INFO : Map 1: -/- Reducer 2: 0/1
INFO : Map 1: 0/12 Reducer 2: 0/1
INFO : Map 1: 0(+2)/12 Reducer 2: 0/1
INFO : Map 1: 0(+2)/12 Reducer 2: 0/1
INFO : Map 1: 2(+2)/12 Reducer 2: 0/1
INFO : Map 1: 2(+3)/12 Reducer 2: 0/1
INFO : Map 1: 2(+5)/12 Reducer 2: 0/1
INFO : Map 1: 2(+6)/12 Reducer 2: 0/1
INFO : Map 1: 4(+4)/12 Reducer 2: 0/1
INFO : Map 1: 4(+6)/12 Reducer 2: 0/1
INFO : Map 1: 5(+5)/12 Reducer 2: 0/1
INFO : Map 1: 5(+6)/12 Reducer 2: 0/1
INFO : Map 1: 6(+5)/12 Reducer 2: 0/1
INFO : Map 1: 8(+4)/12 Reducer 2: 0/1
INFO : Map 1: 9(+3)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 10(+2)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 11(+1)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 12/12 Reducer 2: 0(+1)/1
INFO : Map 1: 12/12 Reducer 2: 1/1
+-----------+--+
| _c0 |
+-----------+--+
| 11997996 |
+-----------+--+
1 row selected (30.208 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode->
03-01-2017
11:16 PM
It's not related to HDP 2.5.0; I just encountered the same on 2.4.3. It was resolved for me after changing the following settings in ambari.properties:
agent.threadpool.size.max
client.threadpool.size.max
The values should be mapped to the actual CPU core count (a sketch follows below). I also increased the heap size for the namenode and datanode to 2 GB from the default 1 GB value.
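For reference, a hedged sketch of what the ambari.properties change might look like; the value 16 is an assumed core count for illustration, and an ambari-server restart is needed afterwards:
# /etc/ambari-server/conf/ambari.properties -- 16 assumes a 16-core host
agent.threadpool.size.max=16
client.threadpool.size.max=16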
02-24-2017
08:29 PM
?logger=com.mysql.jdbc.log.Slf4JLogger&profileSQL=true
02-01-2017
11:52 PM
1 Kudo
SYMPTOM:
hive> create table mytab (col1 int) location '/tmp/abc';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=dbloader, access=WRITE, inode="/user/hive":hive:hadoop:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1780)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1764)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1738)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8445)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:2022)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1451)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2200)
)
hive>
ROOT CAUSE: Misconfiguration of the property hive.metastore.warehouse.dir within hive-site.xml. The value reflects the default location where objects should be created; however, if it is set to a location such as "/user/hive", or to a directory where the permissions do not allow writes, it can produce the exception stated above.
RESOLUTION: Ensure that the value is set to the default, i.e. "/apps/hive/warehouse", or to a directory where you have appropriate permissions (a quick check is sketched below). This error shows up with or without impersonation enabled.
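A quick hedged way to confirm both halves of the resolution, assuming the usual HDP config path: check what hive-site.xml currently holds, and whether the directory is writable by the connecting user:
grep -A1 'hive.metastore.warehouse.dir' /etc/hive/conf/hive-site.xml
hdfs dfs -ls -d /apps/hive/warehouse      # permissions here must allow the connecting user to write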
01-20-2017
09:24 PM
1 Kudo
Problem
When trying to start the secondary hive metastore service, as in this example, zookeeper is unable to create the znode with the appropriate permissions. This is seen mostly on non-Ambari-managed clusters.
17/01/19 18:25:17 ERROR metastore.HiveMetaStore: Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hivedelegationMETASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:473)
at org.apache.hadoop.hive.thrift.HiveDelegationTokenManager.startDelegationTokenSecretManager(HiveDelegationTokenManager.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6031)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5945)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hivedelegationMETASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
Exception in thread "main" org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hivedelegationMETASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:473)
at org.apache.hadoop.hive.thrift.HiveDelegationTokenManager.startDelegationTokenSecretManager(HiveDelegationTokenManager.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6031)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5945)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hivedelegationMETASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
17/01/19 18:25:17 INFO metastore.HiveMetaStore: Shutting down hive metastore.
What to look for/Fix
The problem is usually due to the way zookeeper is configured. For instance, if zoo.cfg, or zookeeper.env (java.env) in some cases, has the following properties set:
kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true
then verify the ACL on the znode via zkCli.sh. In this example, my Zookeeper namespace for hive is set to "hahs2", so here is how the permission looks:
[zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hahs2
'world,'anyone
: r
'sasl,'hive
: cdrwa
When the properties are set to strip away the principal and "hive.cluster.delegation.token.store.zookeeper.acl" is not defined, the ACLs should look something like the above. If this is not the case, then you would see the ACL set to something like this:
[zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hahs2
'sasl,'hive/nodea.openstacklocal@HDP.COM
: cdrwa
These steps worked for me:
1. Stop the hive server processes, i.e. the hive metastore and hiveserver2 instances.
2. Login via zkCli.sh and try to remove the znode with "rmr /hivedelegationMETASTORE". If this gives an error like "No Auth..", then you might need to set any occurrence of the following properties to false. These could be in java.env within /etc/zookeeper/conf or /apache/zookeeper/conf, based on your configuration:
kerberos.removeHostFromPrincipal = false
kerberos.removeRealmFromPrincipal = false
3. Launch zkCli again; you should now be able to delete the znode.
4. Switch the kerberos stripping properties back to the defaults, and ensure they are defined in only one place, either zoo.cfg or java.env (zookeeper-env.sh in some scenarios):
kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true
5. Restart the Zookeeper servers (apply these changes to all the zookeeper servers).
6. Start one of the hivemetastore processes and check that the znodes are created with the appropriate permissions, i.e. with shortnames like this:
[zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hivedelegationMETASTORE
'sasl,'hive
: cdrwa
If this is not the case, then you might need to add the following in hive-site.xml for both hive instances:
<property>
  <name>hive.cluster.delegation.token.store.zookeeper.acl</name>
  <value>sasl:hive:cdrwa</value>
</property>
7. Restart the hivemetastore and hiveserver2 processes. The znodes should now have the ACLs with shortnames.
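For reference, a hedged sketch of opening zkCli.sh against the ensemble used in this example; the client path is the usual HDP location, and the hostname comes from the output above:
/usr/hdp/current/zookeeper-client/bin/zkCli.sh -server nodea.openstacklocal:2181
getAcl /hivedelegationMETASTORE/keys      # child znodes should carry the same short-name ACL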
01-13-2017
09:38 PM
@Yukti Agrawal How much data have you inserted to compare between the two tables? Can you try it out with a substantially bigger data set? Snappy is not very aggressive at reducing size; it is optimized for fast compress/decompress operations rather than compression ratio.
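One way to compare is the on-disk size of each table's warehouse directory; a hedged sketch, where the database and table names are placeholders:
hdfs dfs -du -s -h /apps/hive/warehouse/mydb.db/table_uncompressed
hdfs dfs -du -s -h /apps/hive/warehouse/mydb.db/table_snappy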