Created on 05-06-2015 06:47 AM - edited 09-16-2022 02:28 AM
I have upgraded from CDH 5.3 to CDH 5.4.
A simple SELECT statement in Hive now fails with the following error:
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
hive> select * from employees;
FAILED: SemanticException Unable to determine if hdfs://<hive-master>:8020:8020/user/hive/warehouse/employees is encrypted:
java.lang.IllegalArgumentException: Wrong FS: hdfs://<hive-master>:8020:8020/user/hive/warehouse/employees, expected: hdfs://<hive-master>:8020
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
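The doubled `:8020:8020` in the authority part of the path is what fails the filesystem check in the error above. Below is a minimal Python sketch for spotting such a location string. The function name `has_duplicated_port` and the host `nn.example.com` are made up for illustration and are not part of Hive or CDH.

```python
import re

# Matches scheme://host followed by two port-like suffixes,
# e.g. hdfs://host:8020:8020/...
_DOUBLE_PORT = re.compile(
    r"^[a-z][a-z0-9+.-]*://[^/:]+:(\d+):(\d+)(/|$)", re.IGNORECASE
)

def has_duplicated_port(uri: str) -> bool:
    """Return True if the URI authority carries the same port twice."""
    match = _DOUBLE_PORT.match(uri)
    return bool(match) and match.group(1) == match.group(2)

# Hypothetical host name, same path shape as in the error message.
broken = "hdfs://nn.example.com:8020:8020/user/hive/warehouse/employees"
print(has_duplicated_port(broken))
```

This only inspects the string; it does not contact HDFS or the Hive metastore.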
Created 05-06-2015 10:44 AM
Created 05-06-2015 07:29 AM
Port 8020 is used by the NameNode.
I am not sure why the HDFS path has it twice.
Does Hive pick up the "fs.defaultFS" property from the HDFS service?
Created 05-06-2015 07:41 AM
Created 05-06-2015 07:49 AM
/user/hive/warehouse
Created 05-06-2015 09:57 AM
Created 05-06-2015 10:12 AM
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://<FQN-host-name>:8020</value>
</property>
.................................
.................................
.................................
hive> describe formatted employees;
OK
# col_name data_type comment
emp_id int
name string
salary double
# Detailed Table Information
Database: default
Owner: dast
CreateTime: Thu Apr 09 14:57:46 EDT 2015
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://<host-name>:8020:8020/user/hive/warehouse/employees
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
comment This is the employees table
numFiles 1
numRows 0
rawDataSize 0
totalSize 142
transient_lastDdlTime 1428607184
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
serialization.format ,
Time taken: 0.405 seconds, Fetched: 35 row(s)
Created 05-06-2015 10:22 AM
Created 05-06-2015 10:35 AM
I have just created a brand new table.
Its HDFS location has port 8020 only once.
How can I fix the existing tables so that their locations show port 8020 once instead of twice?
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
hive> select * from users;
OK
100 User1 passwd1
200 User2 passwd2
300 User3 passwd3
400 User4 passwd4
500 User5 passwd5
600 User6 passwd6
Time taken: 0.073 seconds, Fetched: 6 row(s)
hive> describe formatted users;
OK
# col_name data_type comment
user_id int
username string
passwd string
# Detailed Table Information
Database: default
Owner: dast
CreateTime: Wed May 06 13:32:05 EDT 2015
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: hdfs://<host-name>:8020/user/hive/warehouse/users
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
comment This is the users table
numFiles 1
totalSize 121
transient_lastDdlTime 1430933556
# Storage Information
SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat: org.apache.hadoop.mapred.TextInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
field.delim ,
serialization.format ,
Time taken: 0.067 seconds, Fetched: 33 row(s)
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
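One possible way to repoint an existing table is Hive's `ALTER TABLE ... SET LOCATION` statement. The Python sketch below only builds that statement from a broken location string. It is an untested assumption that this resolves the problem on this cluster, and the helper names and the host `nn.example.com` are hypothetical.

```python
import re

# Collapses "host:PORT:PORT" into "host:PORT" in the URI authority only.
_DOUBLE_PORT = re.compile(
    r"^([a-z][a-z0-9+.-]*://[^/:]+):(\d+):\2(?=/|$)", re.IGNORECASE
)

def collapse_duplicated_port(location: str) -> str:
    """Return the location with a repeated port written once; otherwise unchanged."""
    return _DOUBLE_PORT.sub(r"\1:\2", location)

def alter_location_statement(table: str, location: str) -> str:
    """Build an ALTER TABLE ... SET LOCATION statement for the corrected location."""
    fixed = collapse_duplicated_port(location)
    return f"ALTER TABLE {table} SET LOCATION '{fixed}';"

# Hypothetical host name; the table name and path shape come from the thread.
print(alter_location_statement(
    "employees",
    "hdfs://nn.example.com:8020:8020/user/hive/warehouse/employees",
))
# prints: ALTER TABLE employees SET LOCATION 'hdfs://nn.example.com:8020/user/hive/warehouse/employees';
```

The generated statement would still have to be reviewed and run in Hive by hand. Partitioned tables keep a separate location per partition, which this sketch does not cover.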
Created 05-06-2015 10:40 AM
All the existing Hive tables show port 8020 twice in their HDFS location.
What might have caused this during the CDH 5.4 upgrade?
Thanks for your assistance!