
[CDH 5.10 upgrade] Wrong FS Hive tables

Expert Contributor

Hi,

 

I just upgraded my cluster from 5.9 to 5.10 last night. Now I am seeing the "Wrong FS" problem, with the famous duplicate "8020:8020" in the HDFS URIs of my existing tables.

The new tables are fine, so something must have gone wrong during the upgrade.

 

I've seen that one suggested solution is to alter the location of the existing tables. But my problem is that, because of the wrong FS, both ALTER and DROP fail (DROP TABLE gives the same error):

 

hive> alter table mytable set location "hdfs://hadoop-master-1:8020/user/maziyar/warehouse/mytable";
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. Got exception: java.io.IOException Incomplete HDFS URI, no host: hdfs://hadoop-master-1:8020:8020/user/maziyar/warehouse/mytable

 

My question is, how can I fix this? Neither alter nor drop works. I am kind of stuck 🙂

 

PS: I upgraded my cluster via parcels. Everything else seems fine so far.

 

Many thanks.

Maziyar

1 ACCEPTED SOLUTION

Expert Contributor

OK, here is the only thing that worked.

 

I updated the DBS table in the warehouse database in MySQL with the correct URI. After that, ALTER TABLE ... SET LOCATION worked on all the existing tables.
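
For anyone who needs it, this is roughly the kind of update I mean. This is a sketch only: it assumes the metastore lives in a MySQL database named warehouse and that the doubled port is the literal "8020:8020". Back up the metastore database before touching it.

use warehouse;

-- Inspect the damaged URIs first; DBS.DB_LOCATION_URI is the standard
-- Hive metastore column holding each database's location.
SELECT DB_ID, DB_LOCATION_URI FROM DBS WHERE DB_LOCATION_URI LIKE '%8020:8020%';

-- Collapse the doubled port back to a single one.
UPDATE DBS
SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI, ':8020:8020', ':8020')
WHERE DB_LOCATION_URI LIKE '%8020:8020%';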

 

So I am not sure whether there is a bug in "/usr/lib/cmf/service/hive/hive.sh" when you use "Update Hive Metastore NameNodes", or whether that action is only meant to be used when HA is enabled (mine wasn't!). Either way, it is what added the duplicate ports.

 

 

Best,

Maziyar


8 REPLIES

Expert Contributor

I have more info. After upgrading to CDH 5.10 I ran "Update Hive Metastore NameNodes" from Cloudera Manager, and that is what put the duplicate port in via HiveMetaTool.

 

I checked the new table that had been working: after updating the metastore NameNodes, it now has the duplicate port in its URI too (see the check below).
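
A quick way to check any table, in case others want to verify theirs; default.mytable below is just a placeholder for one of your own tables:

hive> describe formatted default.mytable;

The Location: row in the output shows the URI the metastore has stored for the table.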

 

Is there a way to fix this in:

/usr/lib/cmf/service/hive/hive.sh

 

Many thanks,

maziyar

Expert Contributor

I also tried metatool to update the location, but it didn't work:

 

hive --config /etc/hive/conf/conf.server --service metatool -updateLocation "hdfs://hadoop-master-1:8020" "hdfs://hadoop-master-1:8020:8020"
Initializing HiveMetaTool..
HiveMetaTool:A valid host is required in both old-loc and new-loc
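
For reference, metatool can at least print what the metastore currently has recorded as its filesystem root, which is a quick way to confirm the broken URI. Same config path as above:

hive --config /etc/hive/conf/conf.server --service metatool -listFSRoot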

 

OK, now I have tried everything I can think of. There is no way to update the location or to drop the tables.


Champion
I am going to go with "bug", and I will try to test this out to confirm it so an official bug report can be submitted. I think I ran into something similar. I wasn't upgrading; it was a fresh install, but I did restore a metastore from an older version and ran the same command to update the NameNode URI for Hive. If I recall correctly, the table locations were updated correctly, but the DB locations were not: they got the same doubled port entry.

I used the same method to fix it, updating the DBS table in the metastore DB directly; I didn't try the other command you listed.

This was on CDH 5.8.2.

Rising Star

I would say this is a bug. If the user isn't supposed to be performing a certain action (updating the NameNode URIs, in this case), then the UI should either have prevented the user from performing the action, or have done nothing when the cluster was not HA-enabled for the NameNode. Corrupting the metadata is the worst outcome. I will file an internal Jira. Thank you for reporting.

Expert Contributor

I am not sure if this was ever resolved, but after I upgraded from 5.11.1 to 5.13.0 I am seeing this error in spark2-shell:

 

scala> spark.sqlContext.sql("CREATE TABLE IF NOT EXISTS default.employee_test123(id INT, name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'")
java.lang.IllegalArgumentException: Wrong FS: hdfs://abc23.xxx.com:8020/user/hive/warehouse/employee_test123, expected: hdfs://nameservice1
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:662)
at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:482)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply$mcV$sp(HiveExternalCatalog.scala:231)
at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createTable$1.apply(HiveExternalCatalog.scala:200)

 

I have followed the instructions in https://www.cloudera.com/documentation/enterprise/5-11-x/topics/cdh_hag_hdfs_ha_cdh_components_confi... but I am still seeing the same issue.
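
For reference, this is the shape of the metatool repoint that the page describes. It is a sketch only: the hostname and nameservice are taken from the error above, and -dryRun previews the change without writing anything.

hive --config /etc/hive/conf/conf.server --service metatool -updateLocation hdfs://nameservice1 hdfs://abc23.xxx.com:8020 -dryRun

If the dry-run output looks right, run it again without -dryRun.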

Explorer

I hit this error too, and the solution was to update the metastore database with SQL, though I used Oracle rather than MySQL.
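
In Oracle the statement looks roughly like the MySQL one above. This is a sketch against the standard metastore schema, so verify the table and column names in your own schema first; note that Oracle needs an explicit COMMIT:

UPDATE DBS
   SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI, ':8020:8020', ':8020')
 WHERE DB_LOCATION_URI LIKE '%8020:8020%';
COMMIT;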

Expert Contributor

Unfortunately this happened again in 5.13.1: I ran "Update Hive Metastore NameNodes" and it added the port twice.