Member since
05-10-2016
184
Posts
60
Kudos Received
6
Solutions
03-14-2017
09:23 PM
GOALS
Configure Ranger + Ranger KMS
Create an encryption Zone
OS used is CentOS/Redhat 6.6
At the end, should be able to create an encryption zone and validate using Hive
NOTE: This article is in a walkthrough mode wherein snapshots were taken from relevant screens to create a step-by-step guide on installing Ranger and Ranger KMS along with creation and validation of encrypted zones in Hadoop.
https://www.scribd.com/presentation/341887712/HDFS-Encryption-Zone-Hive-Orig
... View more
03-03-2017
03:18 AM
1 Kudo
GOAL
Migrate hive data and metadata Ensure that metadata is updated based on the target clusters Hive metastore version I have used tpch data for demonstration purposes Steps (Performed on Old Cluster) Take a note of all the directories you need to copy from old to new cluster on HDFS [hive@xlnode-standalone datagen]$ hdfs dfs -ls /apps/hive/warehouse
Found 1 items
drwxrwxrwx - hive hdfs 0 2017-03-01 23:31 /apps/hive/warehouse/tpch_text_2.db
[hive@xlnode-standalone datagen]$ beeline -u "jdbc:hive2://localhost:10000/default" -n hive -p ''
WARNING: Use "yarn jar" to launch YARN applications.
Connecting to jdbc:hive2://localhost:10000/default
Connected to: Apache Hive (version 1.2.1000.2.4.3.0-227)
Driver: Hive JDBC (version 1.2.1000.2.4.3.0-227)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1000.2.4.3.0-227 by Apache Hive
0: jdbc:hive2://localhost:10000/default> show databases;
+----------------+--+
| database_name |
+----------------+--+
| default |
| tpch_text_2 |
+----------------+--+
2 rows selected (0.143 seconds)
0: jdbc:hive2://localhost:10000/default> use tpch_text_2;
No rows affected (0.06 seconds)
0: jdbc:hive2://localhost:10000/default> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| customer |
| lineitem |
| nation |
| orders |
| part |
| partsupp |
| region |
| supplier |
+-----------+--+
8 rows selected (0.069 seconds)
Ensure that the count of data from one/more table(s) for validation 0: jdbc:hive2://localhost:10000/default> select count(*) from lineitem;
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select count(*) from lineitem(Stage-1)
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1488410723867_0003)
INFO : Map 1: -/- Reducer 2: 0/1
INFO : Map 1: 0/6 Reducer 2: 0/1
INFO : Map 1: 0/6 Reducer 2: 0/1
INFO : Map 1: 0(+2)/6 Reducer 2: 0/1
INFO : Map 1: 0(+3)/6 Reducer 2: 0/1
INFO : Map 1: 0(+3)/6 Reducer 2: 0/1
INFO : Map 1: 1(+2)/6 Reducer 2: 0/1
INFO : Map 1: 1(+3)/6 Reducer 2: 0/1
INFO : Map 1: 2(+2)/6 Reducer 2: 0/1
INFO : Map 1: 2(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+2)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 3(+3)/6 Reducer 2: 0/1
INFO : Map 1: 4(+2)/6 Reducer 2: 0(+1)/1
INFO : Map 1: 5(+1)/6 Reducer 2: 0(+1)/1
INFO : Map 1: 6/6 Reducer 2: 0(+1)/1
INFO : Map 1: 6/6 Reducer 2: 1/1
+-----------+--+
| _c0 |
+-----------+--+
| 11997996 |
+-----------+--+
1 row selected (24.331 seconds)
0: jdbc:hive2://localhost:10000/default>
Identify the data to be copied over [hdfs@xlnode-standalone ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r-- 3 hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/customer
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/lineitem
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/nation
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/orders
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/part
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/partsupp
drwxr-xr-x - hive hdfs 0 2017-03-01 23:30 /tmp/tpch-generate/2/region
drwxr-xr-x - hive hdfs 0 2017-03-01 23:31 /tmp/tpch-generate/2/supplier
[hdfs@xlnode-standalone ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G /tmp/tpch-generate/2
Identify the metastore database server and dump the hive metadata database [hive@xlnode-standalone ~]$ mysqldump hive -u hive -p > hive.dump Steps (Performed on New Cluster) When copying the data from a non-secure to secure cluster, we can use this. Run this command as the hive user hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true hdfs://nn1:8020/foo/bar hdfs://nn2:8020/bar/foo Monitor the output to check if there are any failures [hive@xlnode-3 ~]$ hadoop distcp hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate hdfs://xlnode-1.hwx.com:8020/tmp/
17/03/02 04:55:11 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=100, sslConfigurationFile='null', copyStrategy='uniformsize', preserveStatus=[], preserveRawXattrs=false, atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate], targetPath=hdfs://xlnode-1.hwx.com:8020/tmp, targetPathExists=true, filtersFile='null'}
17/03/02 04:55:12 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:12 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:12 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:12 INFO hdfs.DFSClient: Cannot get delegation token from rm/xlnode-2.hwx.com@HWX.COM
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Paths (files+dirs) cnt = 25; dirCnt = 10
17/03/02 04:55:13 INFO tools.SimpleCopyListing: Build file listing completed.
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO tools.DistCp: Number of paths in the copy list: 25
17/03/02 04:55:13 INFO impl.TimelineClientImpl: Timeline service address: http://xlnode-2.hwx.com:8188/ws/v1/timeline/
17/03/02 04:55:13 INFO client.RMProxy: Connecting to ResourceManager at xlnode-2.hwx.com/172.26.94.234:8050
17/03/02 04:55:13 INFO client.AHSProxy: Connecting to Application History server at xlnode-2.hwx.com/172.26.94.234:10200
17/03/02 04:55:13 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 105 for hive on 172.26.94.233:8020
17/03/02 04:55:13 INFO security.TokenCache: Got dt for hdfs://xlnode-1.hwx.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: number of splits:10
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1488429683114_0003
17/03/02 04:55:14 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.94.233:8020, Ident: (HDFS_DELEGATION_TOKEN token 105 for hive)
17/03/02 04:55:15 INFO impl.YarnClientImpl: Submitted application application_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: The url to track the job: http://xlnode-2.hwx.com:8088/proxy/application_1488429683114_0003/
17/03/02 04:55:15 INFO tools.DistCp: DistCp job-id: job_1488429683114_0003
17/03/02 04:55:15 INFO mapreduce.Job: Running job: job_1488429683114_0003
17/03/02 04:55:25 INFO mapreduce.Job: Job job_1488429683114_0003 running in uber mode : false
17/03/02 04:55:25 INFO mapreduce.Job: map 0% reduce 0%
17/03/02 04:55:33 INFO mapreduce.Job: map 10% reduce 0%
17/03/02 04:55:34 INFO mapreduce.Job: map 20% reduce 0%
17/03/02 04:55:37 INFO mapreduce.Job: map 30% reduce 0%
17/03/02 04:55:38 INFO mapreduce.Job: map 40% reduce 0%
17/03/02 04:55:39 INFO mapreduce.Job: map 50% reduce 0%
17/03/02 04:55:40 INFO mapreduce.Job: map 60% reduce 0%
17/03/02 04:55:42 INFO mapreduce.Job: map 80% reduce 0%
17/03/02 04:55:45 INFO mapreduce.Job: map 90% reduce 0%
17/03/02 04:55:48 INFO mapreduce.Job: map 100% reduce 0%
17/03/02 04:55:56 INFO mapreduce.Job: Job job_1488429683114_0003 completed successfully
17/03/02 04:55:57 INFO mapreduce.Job: Counters: 33
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=1492170
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2217305398
HDFS: Number of bytes written=2217296726
HDFS: Number of read operations=233
HDFS: Number of large read operations=0
HDFS: Number of write operations=60
Job Counters
Launched map tasks=10
Other local map tasks=10
Total time spent by all maps in occupied slots (ms)=103725
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=103725
Total vcore-milliseconds taken by all map tasks=103725
Total megabyte-milliseconds taken by all map tasks=106214400
Map-Reduce Framework
Map input records=25
Map output records=0
Input split bytes=1150
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=2385
CPU time spent (ms)=90360
Physical memory (bytes) snapshot=3420536832
Virtual memory (bytes) snapshot=46212300800
Total committed heap usage (bytes)=2546466816
File Input Format Counters
Bytes Read=7522
File Output Format Counters
Bytes Written=0
org.apache.hadoop.tools.mapred.CopyMapper$Counter
BYTESCOPIED=2217296726
BYTESEXPECTED=2217296726
COPY=25
Validate if the data has been copied over succesfully [hive@xlnode-3 ~]$ hdfs dfs -ls /tmp/tpch-generate/2
Found 9 items
-rw-r--r-- 3 hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/_SUCCESS
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/customer
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/lineitem
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/nation
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/orders
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/part
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/partsupp
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/region
drwxr-xr-x - hive hdfs 0 2017-03-02 04:55 /tmp/tpch-generate/2/supplier
[hive@xlnode-3 ~]$ hdfs dfs -du -s -h /tmp/tpch-generate/2
2.1 G /tmp/tpch-generate/2
Assuming that HDP 2.5.3 is a new cluster (without any objects) and includes mysql as the metadata database Stop the Metastore Process Restore the backup from old clusters hive.dump to mysql database [root@xlnode-3 ~]# mysql -u hive -D hive -p < /tmp/hive.dump
Enter password:
Because the hive dump was obtained from old cluster, it still might have the old hdfs location(s) for tables, partitions, use the following/similar metatool command to update the location [hive@xlnode-3 ~]$ export HIVE_CONF_DIR=/etc/hive/2.5.3.0-37/0/conf.server; hive --service metatool -updateLocation hdfs://xlnode-standalone.hwx.com:8020 hdfs://xlnode-3.hwx.com:8020 -tablePropKey avro.schema.url -serdePropKey avro.schema.url
Initializing HiveMetaTool..
17/03/02 05:35:54 INFO metastore.ObjectStore: ObjectStore, initialize called
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.fixedDatastore unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
17/03/02 05:35:54 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
17/03/02 05:35:55 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,Database,Type,FieldSchema,Order"
17/03/02 05:35:57 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
17/03/02 05:35:57 INFO metastore.ObjectStore: Initialized ObjectStore
Looking for LOCATION_URI field in DBS table to update..
Successfully updated the following locations..
Updated 0 records in DBS table
Looking for LOCATION field in SDS table to update..
Successfully updated the following locations..
Updated 0 records in SDS table
Looking for value of avro.schema.url key in TABLE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in TABLE_PARAMS table
Looking for value of avro.schema.url key in SD_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SD_PARAMS table
Looking for value of avro.schema.url key in SERDE_PARAMS table to update..
Successfully updated the following locations..
Updated 0 records in SERDE_PARAMS table
NOTE If this command fails to update the records/locations in SDS and DBS tables within hive metastore database, then you can perform the updates manually, however, it is STRONGLY recommended that you take a backup of the database at its current state. mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION | NUM_BUCKETS | OUTPUT_FORMAT | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| 1 | 1 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/lineitem | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1 |
| 2 | 2 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/part | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2 |
| 3 | 3 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/supplier | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3 |
| 4 | 4 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/partsupp | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4 |
| 5 | 5 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/nation | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5 |
| 6 | 6 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/region | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 6 |
| 7 | 7 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/customer | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 7 |
| 8 | 8 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-standalone.hwx.com:8020/tmp/tpch-generate/2/orders | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 8 |
+-------+-------+------------------------------------------+---------------+---------------------------+--------------------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)
mysql> create table SDS_BKP as select * from SDS;
Query OK, 8 rows affected (0.07 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update SDS SET LOCATION = REPLACE(LOCATION,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE LOCATION LIKE '%xlnode-standalone.hwx.com%';
Query OK, 8 rows affected (0.00 sec)
Rows matched: 8 Changed: 8 Warnings: 0
mysql> select * from SDS;
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| SD_ID | CD_ID | INPUT_FORMAT | IS_COMPRESSED | IS_STOREDASSUBDIRECTORIES | LOCATION | NUM_BUCKETS | OUTPUT_FORMAT | SERDE_ID |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
| 1 | 1 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/lineitem | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 1 |
| 2 | 2 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/part | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 2 |
| 3 | 3 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/supplier | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 3 |
| 4 | 4 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/partsupp | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 4 |
| 5 | 5 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/nation | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 5 |
| 6 | 6 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/region | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 6 |
| 7 | 7 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/customer | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 7 |
| 8 | 8 | org.apache.hadoop.mapred.TextInputFormat | | | hdfs://xlnode-1.hwx.com:8020/tmp/tpch-generate/2/orders | -1 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | 8 |
+-------+-------+------------------------------------------+---------------+---------------------------+-----------------------------------------------------------+-------------+------------------------------------------------------------+----------+
8 rows in set (0.00 sec)
mysql>
mysql> commit;
Query OK, 0 rows affected (0.04 sec)
mysql> select * from DBS;
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
| 1 | Default Hive database | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse | default | public | ROLE |
| 2 | NULL | hdfs://xlnode-standalone.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive | USER |
+-------+-----------------------+--------------------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.04 sec)
mysql> create table DBS_BKP as select * from DBS;
Query OK, 2 rows affected (0.04 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> update DBS SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI,'hdfs://xlnode-standalone.hwx.com:8020','hdfs://xlnode-1.hwx.com:8020') WHERE DB_LOCATION_URI LIKE '%xlnode-standalone.hwx.com%';
Query OK, 2 rows affected (0.00 sec)
Rows matched: 2 Changed: 2 Warnings: 0
mysql> select * from DBS;
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
| DB_ID | DESC | DB_LOCATION_URI | NAME | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
| 1 | Default Hive database | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse | default | public | ROLE |
| 2 | NULL | hdfs://xlnode-1.hwx.com:8020/apps/hive/warehouse/tpch_text_2.db | tpch_text_2 | hive | USER |
+-------+-----------------------+-----------------------------------------------------------------+-------------+------------+------------+
2 rows in set (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.03 sec)
mysql>
Next step is to perform the metastore upgrade as the dump was obtained from HDP 2.4.3 (Hive 1.2.1) and there, are/may have, additional objects which were introduced in the new version HDP 2.5.3 (Hive 2.1.0). [hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose NOTE This step can produce some challenges, for instance, this tool tries to CREATE some INDEXEs that may already exist, here are two errors which I encountered: [hive@xlnode-3 bin]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.003 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
Error: Duplicate key name 'CONSTRAINTS_PARENT_TABLE_ID_INDEX' (state=42000,code=1061)
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
... 8 more
*** schemaTool failed ***
Above error is generated because our upgrade command is attempting to CREATE an INDEX which already exists, here is how we locate the script. From the above command, look for the main script file name "Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql" because it is a nested script and the scripts are usually located in a predefined place. /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql
[root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/upgrade-2.0.0-to-2.1.0.mysql.sql
SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' ';
SOURCE 034-HIVE-13076.mysql.sql;
SOURCE 035-HIVE-13395.mysql.sql;
SOURCE 036-HIVE-13354.mysql.sql;
UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1;
SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '; The source of our issue is actually from another child script "034-HIVE-13076.mysql.sql" , review its contents and introduce a DROP INDEX command. The script looked something like this: [root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/034-HIVE-13076.mysql.sql
CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS`
(
`CHILD_CD_ID` BIGINT,
`CHILD_INTEGER_IDX` INT(11),
`CHILD_TBL_ID` BIGINT,
`PARENT_CD_ID` BIGINT NOT NULL,
`PARENT_INTEGER_IDX` INT(11) NOT NULL,
`PARENT_TBL_ID` BIGINT NOT NULL,
`POSITION` BIGINT NOT NULL,
`CONSTRAINT_NAME` VARCHAR(400) NOT NULL,
`CONSTRAINT_TYPE` SMALLINT(6) NOT NULL,
`UPDATE_RULE` SMALLINT(6),
`DELETE_RULE` SMALLINT(6),
`ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL,
PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS; <<<<<< Manually added
CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE;
You might see another error stating that the table WRITE_SET already exists, once you rerun the upgradeSchema command after above changes. This table was introduced in HDP version greater than 2.4.3 (Hive 1.2.1) [hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.011 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.002 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.188 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.075 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Error: Table 'WRITE_SET' already exists (state=42S01,code=1050)
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:263)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:231)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.io.IOException: Schema script failed, errorcode 2
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:410)
at org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:367)
at org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:258)
... 8 more
*** schemaTool failed ***
Verify that the WRITE_SET table in the current database is empty, if it is, then just add a line at the top to drop this table [root@xlnode-3 ~]# cat /usr/hdp/2.5.3.0-37/hive2/scripts/metastore/upgrade/mysql/035-HIVE-13395.mysql.sql
DROP TABLE WRITE_SET;
CREATE TABLE WRITE_SET (
WS_DATABASE varchar(128) NOT NULL,
WS_TABLE varchar(128) NOT NULL,
WS_PARTITION varchar(767),
WS_TXNID bigint NOT NULL,
WS_COMMIT_ID bigint NOT NULL,
WS_OPERATION_TYPE char(1) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1);
Run the upgradeSchema command again and you should see an output similar to this upon completion [hive@xlnode-3 ~]$ cd /usr/hdp/2.5.3.0-37/hive2/bin/ && export HIVE_CONF_DIR=/etc/hive/conf/conf.server; ./schematool -dbType mysql -upgradeSchema --verbose
which: no hbase in (/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hive/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hive2/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.5.3.0-37/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
Starting upgrade metastore schema from version 2.0.0 to 2.1.0
Upgrade script upgrade-2.0.0-to-2.1.0.mysql.sql
Connecting to jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
Connected to: MySQL (version 5.1.73)
Driver: MySQL-AB JDBC Driver (version mysql-connector-java-5.1.17-SNAPSHOT ( Revision: ${bzr.revision-id} ))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:mysql://xlnode-3.hwx.com/hive> !autocommit on
Autocommit status: true
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+-------------------------------------------------+--+
| |
+-------------------------------------------------+--+
| Upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+-------------------------------------------------+--+
1 row selected (0.018 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE IF NOT EXISTS `KEY_CONSTRAINTS` ( `CHILD_CD_ID` BIGINT, `CHILD_INTEGER_IDX` INT(11), `CHILD_TBL_ID` BIGINT, `PARENT_CD_ID` BIGINT NOT NULL, `PARENT_INTEGER_IDX` INT(11) NOT NULL, `PARENT_TBL_ID` BIGINT NOT NULL, `POSITION` BIGINT NOT NULL, `CONSTRAINT_NAME` VARCHAR(400) NOT NULL, `CONSTRAINT_TYPE` SMALLINT(6) NOT NULL, `UPDATE_RULE` SMALLINT(6), `DELETE_RULE` SMALLINT(6), `ENABLE_VALIDATE_RELY` SMALLINT(6) NOT NULL, PRIMARY KEY (`CONSTRAINT_NAME`, `POSITION`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP INDEX CONSTRAINTS_PARENT_TABLE_ID_INDEX ON KEY_CONSTRAINTS
No rows affected (0.087 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE INDEX `CONSTRAINTS_PARENT_TABLE_ID_INDEX` ON KEY_CONSTRAINTS (`PARENT_TBL_ID`) USING BTREE
No rows affected (0.061 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> DROP TABLE WRITE_SET
No rows affected (0.004 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> CREATE TABLE WRITE_SET ( WS_DATABASE varchar(128) NOT NULL, WS_TABLE varchar(128) NOT NULL, WS_PARTITION varchar(767), WS_TXNID bigint NOT NULL, WS_COMMIT_ID bigint NOT NULL, WS_OPERATION_TYPE char(1) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1
No rows affected (0.052 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE TXN_COMPONENTS ADD TC_OPERATION_TYPE char(1)
No rows affected (0.05 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPACTION_QUEUE ADD CQ_TBLPROPERTIES varchar(2048)
No rows affected (0.091 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> ALTER TABLE COMPLETED_COMPACTIONS ADD CC_TBLPROPERTIES varchar(2048)
No rows affected (0.043 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> UPDATE VERSION SET SCHEMA_VERSION='2.1.0', VERSION_COMMENT='Hive release version 2.1.0' where VER_ID=1
1 row affected (0.016 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> SELECT 'Finished upgrading MetaStore schema from 2.0.0 to 2.1.0' AS ' '
+----------------------------------------------------------+--+
| |
+----------------------------------------------------------+--+
| Finished upgrading MetaStore schema from 2.0.0 to 2.1.0 |
+----------------------------------------------------------+--+
1 row selected (0.001 seconds)
0: jdbc:mysql://xlnode-3.hwx.com/hive> !closeall
Closing: 0: jdbc:mysql://xlnode-3.hwx.com/hive?createDatabaseIfNotExist=true
beeline>
beeline>
Completed upgrade-2.0.0-to-2.1.0.mysql.sql
schemaTool completed
[hive@xlnode-3 bin]$
Restart the Metastore Process via Ambari and validate bother version and data for tables
MySQL mysql> select * from VERSION;
+--------+----------------+----------------------------+
| VER_ID | SCHEMA_VERSION | VERSION_COMMENT |
+--------+----------------+----------------------------+
| 1 | 2.1.0 | Hive release version 2.1.0 |
+--------+----------------+----------------------------+
1 row in set (0.00 sec)
mysql>
Hive Database 0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> use tpch_text_2;
No rows affected (0.146 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> show tables;
+-----------+--+
| tab_name |
+-----------+--+
| customer |
| lineitem |
| nation |
| orders |
| part |
| partsupp |
| region |
| supplier |
+-----------+--+
8 rows selected (0.147 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode-> select count(*) from lineitem;
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select count(*) from lineitem(Stage-1)
INFO :
INFO : Status: Running (Executing on YARN cluster with App id application_1488431838473_0001)
INFO : Map 1: -/- Reducer 2: 0/1
INFO : Map 1: 0/12 Reducer 2: 0/1
INFO : Map 1: 0(+2)/12 Reducer 2: 0/1
INFO : Map 1: 0(+2)/12 Reducer 2: 0/1
INFO : Map 1: 2(+2)/12 Reducer 2: 0/1
INFO : Map 1: 2(+3)/12 Reducer 2: 0/1
INFO : Map 1: 2(+5)/12 Reducer 2: 0/1
INFO : Map 1: 2(+6)/12 Reducer 2: 0/1
INFO : Map 1: 4(+4)/12 Reducer 2: 0/1
INFO : Map 1: 4(+6)/12 Reducer 2: 0/1
INFO : Map 1: 5(+5)/12 Reducer 2: 0/1
INFO : Map 1: 5(+6)/12 Reducer 2: 0/1
INFO : Map 1: 6(+5)/12 Reducer 2: 0/1
INFO : Map 1: 8(+4)/12 Reducer 2: 0/1
INFO : Map 1: 9(+3)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 10(+2)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 11(+1)/12 Reducer 2: 0(+1)/1
INFO : Map 1: 12/12 Reducer 2: 0(+1)/1
INFO : Map 1: 12/12 Reducer 2: 1/1
+-----------+--+
| _c0 |
+-----------+--+
| 11997996 |
+-----------+--+
1 row selected (30.208 seconds)
0: jdbc:hive2://xlnode-3.hwx.com:2181,xlnode->
... View more
Labels:
02-24-2017
08:29 PM
?logger=com.mysql.jdbc.log.Slf4JLogger&profileSQL=true ?logger=com.mysql.jdbc.log.Slf4JLogger&profileSQL=true
... View more
02-01-2017
11:52 PM
1 Kudo
SYMPTOM: hive> create table mytab (col1 int) location '/tmp/abc';
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.security.AccessControlException: Permission denied: user=dbloader, access=WRITE, inode="/user/hive":hive:hadoop:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:219)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1780)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1764)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1738)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8445)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:2022)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1451)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2206)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2202)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2200)
)
hive>
ROOT CAUSE: Misconfiguration of the property hive.metastore.warehouse.dir within hive-site.xml. The value reflects the default location where objects should be created, however, if it is set to a location such as "/user/hive" or a directory where the permissions do not apply, it could produce the exception as stated above. RESOLUTION: Ensure that the value is set to the default i.e., "/apps/hive/warehouse" or a directory where you have appropriate permissions. This error would show up with/without impersonation enabled.
... View more
Labels:
01-20-2017
09:24 PM
1 Kudo
Problem When trying to start the secondary hive metastore service, in this example, zookeeper is unable to create the znode with the appropriate permissions. This can be seen mostly on the non-Ambari managed clusters. 17/01/19 18:25:17 ERROR metastore.HiveMetaStore: Metastore Thrift Server threw an exception...
org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hivedelegationMETASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:473)
at org.apache.hadoop.hive.thrift.HiveDelegationTokenManager.startDelegationTokenSecretManager(HiveDelegationTokenManager.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6031)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5945)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hivedelegationMETASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
Exception in thread "main" org.apache.hadoop.hive.thrift.DelegationTokenStore$TokenStoreException: Error creating path /hivedelegationMETASTORE/keys
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:166)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.initClientAndPaths(ZooKeeperTokenStore.java:236)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.init(ZooKeeperTokenStore.java:473)
at org.apache.hadoop.hive.thrift.HiveDelegationTokenManager.startDelegationTokenSecretManager(HiveDelegationTokenManager.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:6031)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5945)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hivedelegationMETASTORE/keys
at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:691)
at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:675)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:672)
at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:453)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:443)
at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:423)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:257)
at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:205)
at org.apache.hadoop.hive.thrift.ZooKeeperTokenStore.ensurePath(ZooKeeperTokenStore.java:160)
... 11 more
17/01/19 18:25:17 INFO metastore.HiveMetaStore: Shutting down hive metastore.
What to look for/Fix The problem is usually due to the way zookeeper is configured. For instance, if zoo.cfg or zookeeper.env (java.env) in some cases have the following properties set kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true
then verify the ACL on the permission via zkCli.sh. In this example, my Zookeeper namespace for hive is set to "hahs2" so here is how the permission looks like [zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hahs2
'world,'anyone
: r
'sasl,'hive
: cdrwa
When the permissions are set to strip away the principal and "hive.cluster.delegation.token.store.zookeeper.acl" is not defined, the ACLs should look something like above. If this is not the case, then you would see the ACL set something like this [zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hahs2
'sasl,'hive/nodea.openstacklocal@HDP.COM
: cdrwa These steps worked for me Stop the hive server processes i.e. hive metastore and hiveserver2 instances Login via zkCli.sh and try to remove the znode "rmr /hivedelegationMETASTORE" , if this gives error like "No Auth.." then you might need to modify any existence of the following properties to false. This could be there in java.env within /etc/zookeeper/conf or /apache/zookeeper/conf, based on your configuration kerberos.removeHostFromPrincipal = false
kerberos.removeRealmFromPrincipal = false
Launch zkCli again and you should be able to delete the znode Switch back the kerberos stripping properties to default and ensure there is only one place this is defined, either zoo.cfg or java.env (could be zookeeper-env.sh) in some scenarios kerberos.removeHostFromPrincipal = true
kerberos.removeRealmFromPrincipal = true Restart the Zookeeper servers (apply these changes to all the zookeeper servers) Start one of the hivemetastore processes and check if the znodes are created with appropriate permissions i.e. shortnames like this [zk: nodea.openstacklocal(CONNECTED) 1] getAcl /hivedelegationMETASTORE
'sasl,'hive
: cdrwa If this is not the case then you might need to add the following in hive-site.xml for either hive instances <name>hive.cluster.delegation.token.store.zookeeper.acl</name>
<value>sasl:hive:cdrwa</value>
Restart the hivemetastore and hiveserver2 processes. This should ideally have the ACLs with shortnames.
... View more
Labels:
01-13-2017
05:47 AM
2 Kudos
Goal The goal of of this article is to get familiar with the most convenient method of accessing HAWQ Database i.e. using psql native client Assumptions
HAWQ Database is available and active PSQL Native Client (Available for download @ https://network.pivotal.io) PSQL or "psql" (the binary command) is the ideal choice to connect to HAWQ database/warehouse because HAWQ itself is based on Postgres database. Steps to configure psql on Client/Edge node:
Make a note of the Public IP address for the Client/Edge Node, this will be needed later. Access to HAWQ is available by default from HAWQ master host, make a note of HAWQ database name and user you would want to use for enabling database access.
To identify name of database (psql -l) from the gpadmin user prompt in linux To identify available user names "psql -c \du" Download the HAWQ installation binaries from the following site https://network.pivotal.io/products/pivotal-hdb#/releases/3466/file_groups/520 Follow this guide to setup the HAWQ repository http://hdb.docs.pivotal.io/211/hdb/install/setup-hdb-repos.html Install psql binary, this should be available from the repo we just set up [root@sandbox ~]# yum install hawq
Now the client is installed, next ensure that HAWQ listens to this edge node. To do this, you need to identify the data directory for HAWQ master, here is how we can do that. Login to HAWQ master host via SSH as gpadmin user & execute the following command [root@sandbox ~]# ps aux | egrep silent
gpadmin 209268 0.1 3.5 591072 357720 ? Ss 03:02 0:00 /usr/local/hawq_2_1_0_0/bin/postgres -D /data/hawq/master -i -M master -p 5433 --silent-mode=true
gpadmin 209428 0.2 4.0 693584 407636 ? Ss 03:02 0:01 /usr/local/hawq_2_1_0_0/bin/postgres -D /data/hawq/segment -i -M segment -p 40000 --silent-mode=true
root 212230 0.0 0.0 8456 924 pts/1 S+ 03:12 0:00 egrep silent
NOTE: You might see an output like above, this is because the demo setup was on a sandbox, so both master and segment nodes are on the same sandbox. We need to look at the path which says "master", in this case "/data/hawq/master.
Change directory to master data folder and make a copy of the file called pg_hba.conf [root@sandbox ~]# cd /data/hawq/master
[root@sandbox master]# cp pg_hba.conf pg_hba.conf.orig
File "pg_hba.conf" will already have entries like the following local all gpadmin ident
host all gpadmin 127.0.0.1/28 trust
host all gpadmin 172.16.105.183/32 trust
Here "host" or "local" specifies if the client entry point is local (on same host) or "host" from a different node (either within the same cluster or on the LAN) "all" basically allows access to all the databases, this can be specific, for instance, "hdw" provided that is the name of your database "gpadmin" is the default superuser for HAWQ cluster, this user has access to the cluster (within specified IP ranges) IP address is either the 32-bit absolute IP address or could be opened to a subnet "trust" is the method Now to understand the last entry in the above example, we are providing access for from any "host", provided "all"/any database only when the user is "gpadmin" trying to connect from client which has the IP address "172.16.105.183" and method is "trust", so there is no password required. To understand pg_hba.conf better, here is the link: https://www.postgresql.org/docs/8.3/static/auth-pg-hba-conf.html
Now assuming that your edge node has an IP address of "172.26.78.183", database name is "test", username is "myuser", and you want to use securely use minimally secure login method "md5" then the entry would look something like this. host test myuser 172.26.78.183/32 md5
Once the entry is made, we need to ensure that HAWQ is infact open to listening to remote hosts. A "*" signifies that HAWQ can listen to all remote hosts. [gpadmin@sandbox ~]$ egrep -i --color listen_address /data/hawq/master/postgresql.conf
#listen_addresses = '*' # what IP address(es) to listen on;
Now ensure that our entry in pg_hba.conf file is cached by HAWQ process, hence, HAWQ will be able to restrict the creation new processes. (We need to be gpadmin user) [gpadmin@sandbox ~]$ /usr/local/hawq/bin/pg_ctl reload -D /data/hawq/master
server signaled
[gpadmin@sandbox ~]$
You should be able to connect using the psql now [gpadmin@sandbox ~]$ psql -U myuser -h 172.16.105.183 -d test
Password for user myuser:
psql (8.2.15)
Type "help" for help.
test=>
... View more
11-10-2016
02:35 AM
1 Kudo
Goal Access Hive through R shell with Rhive Authentication mechanism: Kerberos
Assumptions Have a HDP cluster up and running Have configured and installed R Cluster is configured on CentOS7 Need to have openssl-devel installed on the server where R is installed If using openjdk, ensure that the JAVA_HOME is set to the path where javac is available Steps Download Rserve wget https://rforge.net/Rserve/snapshot/Rserve_1.8-5.tar.gz Use R to install Rserve R CMD INSTALL Rserve_1.8-5.tar.gz Download rhive library git clone https://github.com/nexr/RHive.gita Set the appropriate path for HIVE_HOME and HADOOP_HOME export HADOOP_HOME=/usr/hdp/2.5.0.0-1245/hadoop; export HIVE_HOME=/usr/hdp/2.5.0.0-1245/hivea Change directory to RHive cd RHive Use ant to build ant build Use R to build RHive, this will build a file within the same directory R CMD build RHive Use the newly created file name to install RHive library R CMD INSTALL RHive_2.0-0.10.tar.gz Demo Launch R using "R" at the unix prompt Load RHive library > library(RHive)
Loading required package: rJava
Loading required package: Rserve
Use the connection string in this format > rhive.connect(host="node1.hortonworks.com:10000/default;principal=hive/node1.hortonworks.com@HDP.COM;AuthMech=1;KrbHostFQDN=service.hortonworks.com;KrbServiceName=hive;KrbRealm=HDP.COM", defaultFS="hdfs://node1.hortonworks.com/rhive", hiveServer2=TRUE,updateJar=FALSE)
2016-11-09 22:30:00,425 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-11-09 22:30:01,361 WARN [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-11-09 22:30:01,853 INFO [Thread-7] jdbc.Utils (Utils.java:parseURL(316)) - Supplied authorities: node1.hortonworks.com:10000
2016-11-09 22:30:01,853 INFO [Thread-7] jdbc.Utils (Utils.java:parseURL(432)) - Resolved authority: node1.hortonworks.com:10000
Querying a table > rhive.query("SELECT * FROM default.test")
test.col1 test.col2
1 1 1
2 2 2
3 3 3
4 4 4
>
Enjoy !!
... View more
Labels:
11-07-2016
06:20 PM
2 Kudos
Error org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : Access denied for user 'hive'@'node1.hortonworks.com' (using password: YES)
SQL Error code: 1045
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
at org.apache.hive.beeline.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:80)
at org.apache.hive.beeline.HiveSchemaTool.getConnectionToMetastore(HiveSchemaTool.java:133)
at org.apache.hive.beeline.HiveSchemaTool.testConnectionToMetastore(HiveSchemaTool.java:187)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:291)
at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:277)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:526)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233) Background The above output is an attempt of Ambari trying to initiate the hive metastore process based on the parameters supplied in the following properites javax.jdo.option.ConnectionDriverName
javax.jdo.option.ConnectionURL
javax.jdo.option.ConnectionUserName
This is mostly not required if all the credentials are aligned, however, its possible we might end up with such issues when we are trying to use customized DB access. The above error is relevant to mysql db as metastore repository, however, same principal can be applied to other DBs as well. Mysql's authentication mechanism allows user/host credentials to be defined in a certain order. Here is an example. mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> select user,host from user;
+------+-----------------------+
| user | host |
+------+-----------------------+
| hive | % |
| hive | 127.0.0.1 |
| root | 127.0.0.1 |
| hive | ::1 |
| root | ::1 |
| hive | localhost |
| root | localhost |
| hive | node1.hortonworks.com |
| root | node1.hortonworks.com |
| hive | node2.hortonworks.com |
| hive | node3.hortonworks.com |
+------+-----------------------+
11 rows in set (0.00 sec)
We can verify the error as indicated by the ambari log output at mysql level, using the superuser access, because this is what Ambari server is trying to do. [root@node1 ~]# mysql -u hive -h node1.hortonworks.com -p
Enter password:
ERROR 1045 (28000): Access denied for user 'hive'@'node1.hortonworks.com' (using password: YES)
How-to-fix Observe the output before exception from Ambari log, look at the username, password and the command string which is being attempted. resource_management.core.exceptions.Fail: Execution of 'export HIVE_CONF_DIR=/usr/hdp/current/hive-metastore/conf/conf.server ; /usr/hdp/current/hive-server2-hive2/bin/schematool -initSchema -dbType mysql -userName hive -passWord [PROTECTED] -verbose' returned 1. which: no hbase in (/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/var/lib/ambari-agent:/var/lib/ambari-agent) If you are not using the root user, ensure that you add another property to hive-site.xml i.e., javax.jdo.option.ConnectionPassword with password for the user that is configured for property javax.jdo.option.ConnectionUserName At mysql level, login as root user and grant with the following syntax (you might need to execute 'flush privileges' post this, not necessary. We do need to note that if you do not specify "identified by", the error would still remain even though the user 'hive' has been configured to be identified by 'password' which was set while creating the user. mysql> grant all on *.* to 'hive'@'node1.hortonworks.com' identified by 'changeme';
Query OK, 0 rows affected (0.00 sec) Even if local to the host, test out if you are able to login with the supplied credentials [root@node1 ~]# mysql -u hive -h node1.hortonworks.com -p
Enter password:
... View more
Labels:
10-26-2016
11:05 PM
Problem Using rhive with following parameters to connect to hive library(RHive)
Sys.setenv(HADOOP_CONF_DIR="/etc/hadoop/conf")
Sys.setenv(RHIVE_HIVESERVER_VERSION="2")
Sys.setenv(HIVE_HOME="/usr/hdp/current/hive-client")
Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")
rhive.init()
> rhive.connect()
2016-10-26 13:06:22,836 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1173)) - fs.default.name is deprecated. Instead, use fs.defaultFS
2016-10-26 13:06:23,720 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-10-26 13:06:24,631 WARN [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) - The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
2016-10-26 13:06:25,237 INFO [Thread-5] jdbc.Utils (Utils.java:parseURL(309)) - Supplied authorities: 127.0.0.1:10000
2016-10-26 13:06:25,240 INFO [Thread-5] jdbc.Utils (Utils.java:parseURL(397)) - Resolved authority: 127.0.0.1:10000
2016-10-26 13:06:25,292 INFO [Thread-5] jdbc.HiveConnection (HiveConnection.java:openTransport(209)) - Will try to open client transport with JDBC Uri: jdbc:hive2://127.0.0.1:10000/default
Exception in thread "Thread-5" java.lang.RuntimeException: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.connect(HiveJdbcClient.java:332)
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.run(HiveJdbcClient.java:314)
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:255)
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:246)
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:592)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:195)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at com.nexr.rhive.hive.DatabaseConnection.connect(DatabaseConnection.java:52)
at com.nexr.rhive.hive.HiveJdbcClient$HiveJdbcConnector.connect(HiveJdbcClient.java:325)
... 1 more
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:266)
at org.apache.hive.service.cli.CLIService.openSessionWithImpersonation(CLIService.java:202)
at org.apache.hive.service.cli.thrift.ThriftCLIService.getSessionHandle(ThriftCLIService.java:402)
at org.apache.hive.service.cli.thrift.ThriftCLIService.OpenSession(ThriftCLIService.java:297)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1253)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$OpenSession.getResult(TCLIService.java:1238)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy20.open(Unknown Source)
at org.apache.hive.service.cli.session.SessionManager.openSession(SessionManager.java:258)
... 12 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.hive.service.cli.session.HiveSessionImpl.open(HiveSessionImpl.java:137)
at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
... 20 more
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException:Username: 'anonymous' not found. Make sure your client's username exists on the cluster
at org.apache.hadoop.ipc.Client.call(Client.java:1427)
at org.apache.hadoop.ipc.Client.call(Client.java:1358)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:596)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:554)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:508)
... 25 more
Resolution The mechanism to pass on a user name resolves the issue. Here is the example of the passing the connection string which works for rhive. library(RHive)
Sys.setenv(HADOOP_CONF_DIR="/etc/hadoop/conf")
Sys.setenv(RHIVE_HIVESERVER_VERSION="2")
Sys.setenv(HIVE_HOME="/usr/hdp/current/hive-client")
Sys.setenv(HADOOP_HOME="/usr/hdp/current/hadoop-client")
rhive.init()
rhiveConnection<-rhive.connect(host="127.0.0.1", port=10000, hiveServer2=NA, defaultFS=NULL, updateJar=FALSE, user="hive", password=NULL, db="default", properties = character(0))
NOTE: Ensure following parameters are set correctly JAVA_HOME PATH should pick our JAVA_HOME's binary so "export PATH=$JAVA_HOME/bin:${PATH}" Rhive looks for binaries, thus when setting the HIVE_HOME and HADOOP_HOME, ensure that hive-client and hadoop-client directories are being picked. Verify environment variables for rhive using rhive.env() Very important to use doube quotes "" for enclosing the string/character values, like user/database name
... View more
Labels:
10-25-2016
08:06 PM
2 Kudos
Observed in Teradata's distribution of Hortonworks, when ACID is enabled via Ambari, following error can show up. Error: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract (state=,code=0)
org.apache.hive.service.cli.HiveSQLException: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244)
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:247)
at org.apache.hive.beeline.Commands.execute(Commands.java:848)
at org.apache.hive.beeline.Commands.sql(Commands.java:713)
at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:983)
at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:718)
at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:767)
at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485)
at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.RuntimeException: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:83)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.AbstractMethodError: Method org/postgresql/jdbc3/Jdbc3ResultSet.isClosed()Z is abstract
at org.postgresql.jdbc3.Jdbc3ResultSet.isClosed(Jdbc3ResultSet.java)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:1047)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.close(TxnHandler.java:1059)
at org.apache.hadoop.hive.metastore.txn.TxnHandler.getOpenTxns(TxnHandler.java:279)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_open_txns(HiveMetaStore.java:5567)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy8.get_open_txns(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getValidTxns(HiveMetaStoreClient.java:1829)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
at com.sun.proxy.$Proxy9.getValidTxns(Unknown Source)
at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidTxns(DbTxnManager.java:327)
at org.apache.hadoop.hive.ql.Driver.recordValidTxns(Driver.java:994)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:441)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:406)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
... 18 more
Closing: 0: jdbc:hive2://localhost:10000/default
This has nothing to do with the distribution itself, however, it is a result of the JDBC driver copied in this location: /usr/hdp/2.x.x.x-xxx/hive/lib Resolution Find the version of your metadata database, ideally it could be one of the following (existing databases)
MySQL Oracle PostgreSQL Find the compliant JDBC drivers
For MySQL (finding the version of DB) or ask your Database Administrator who manages this DB [root@sandbox lib]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 41
Server version: 5.1.73 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
In the above scenario, version of MySQL DB is 5.1.73. Ensure that the relevant JDBC driver is downloaded HERE For PostgreSQL (finding version of DB) or ask your Database Administrator who manages this DB [root@sandbox lib]# su - postgres
-bash-4.1$ psql -c "select version()"
version
-------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.4.20 on x86_64-redhat-linux-gnu, compiled by GCC gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
(1 row)
-bash-4.1$
The version of PostgreSQL db here is 8.4.20. You can get the drivers for PostgreSQL JDBC HERE Additionally, ensure that only JDBC4 type of drivers are being used. In the error raised above JDBC3 type drivers were being used which are not as efficient as JDBC4. Download the relevant JDBC4 jars and replace them with the drivers in /usr/hdp/2.x.x.x.xxxx/hive/lib Ensure that hive.txn.manager is set to "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager", then restart Ambari server and all other services which indicate that a restart is required Test out by running some queries via Hive View, Beeline OR Hive Shell
... View more
Labels:
- « Previous
-
- 1
- 2
- Next »