
(UPDATE) Hive3 Metastore occasionally isn't deleting from TBLS when a drop table is executed


Explorer

UPDATE: I'm wondering if this is happening on deletion, not creation. A DESCRIBE runs before the external table creation, and it throws the same error. I'm checking the MySQL binlogs for clues.

 

UPDATE2: After reading the binlog, I am more convinced that the Hive DROP TABLE <TABLE> command isn't completing correctly and isn't deleting the row from TBLS.

--------------------------------------------------------------

This is a behavior I've experienced frequently in Hive 3 that almost never occurred in earlier versions of Hive.

Hive Version: 3.0.0.3.1

Metastore database: MySQL.

 

As part of an incremental Sqoop data import, I create two tables: one external, one ORC.

Sometimes during the creation of these two tables, the metastore doesn't populate the SERDES, SDS, and CDS tables with the corresponding metadata and leaves SD_ID null in the TBLS table. The table remains unusable (it can't even be dropped) until I insert dummy rows into the corresponding tables and update the SD_ID field in TBLS.
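A minimal sketch of how such rows could be spotted, not a definitive diagnostic. It uses an in-memory SQLite database as a stand-in for the MySQL metastore and models only the TBLS and SDS columns needed for the check. The table names `main_ext` and `diff_ext`, the location string, and the helper `find_broken_tables` are hypothetical; the same SELECT would have to be adapted and run read-only against the real metastore schema.

```python
import sqlite3

# Toy stand-in for the metastore database (assumption: only the columns
# needed for this check are modeled; the real schema has many more).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SDS  (SD_ID INTEGER PRIMARY KEY, LOCATION TEXT);
    CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, TBL_NAME TEXT,
                       TBL_TYPE TEXT, SD_ID INTEGER);
    -- Hypothetical sample rows: one healthy table, one with a null SD_ID.
    INSERT INTO SDS  VALUES (10, 'hdfs:///warehouse/main_ext');
    INSERT INTO TBLS VALUES (1, 'main_ext', 'EXTERNAL_TABLE', 10);
    INSERT INTO TBLS VALUES (2, 'diff_ext', 'EXTERNAL_TABLE', NULL);
""")


def find_broken_tables(conn):
    """Return (TBL_ID, TBL_NAME) for TBLS rows that have no matching SDS row.

    The LEFT JOIN catches both a null SD_ID and an SD_ID that points at a
    storage descriptor that does not exist.
    """
    query = """
        SELECT t.TBL_ID, t.TBL_NAME
        FROM TBLS t
        LEFT JOIN SDS s ON s.SD_ID = t.SD_ID
        WHERE s.SD_ID IS NULL
        ORDER BY t.TBL_ID
    """
    return conn.execute(query).fetchall()


broken = find_broken_tables(conn)
print(broken)
```

Running a check like this after each incremental load would at least show when a half-written table entry appears, which helps narrow down whether the create or the drop left it behind.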

 

The statements for table creation are as such:
EXTERNAL
create external table if not exists <DIFF EXTERNAL TABLE NAME> like <MAIN EXTERNAL TABLE NAME>;
ORC
create table <DIFF ORC TABLE> stored as orc as select * from <DIFF EXTERNAL TABLE NAME>;

The metastore logs are not very informative:

2019-11-13T04:08:07,696 ERROR [pool-6-thread-187]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - java.lang.NullPointerException

2019-11-13T04:08:07,696 ERROR [pool-6-thread-187]: server.TThreadPoolServer (TThreadPoolServer.java:run(297)) - Error occurred during processing of message.

The HiveServer2 logs only show errors after the issue has already occurred.


Re: (UPDATE) Hive3 Metastore occasionally isn't deleting from TBLS when a drop table is executed

Mentor

@Eric_B 

 

An external table is not managed by Hive; Hive only stores the metadata/schema that describes the external files. The files behind an external table can be accessed and managed by processes outside of Hive. External tables can read data from sources such as CSV files, S3, Azure Storage, or remote HDFS locations.

 

When you drop an external table, you are ONLY removing the metadata; unlike with a managed table, the underlying data in S3 or in the CSV file is not deleted.

Please read "Create, use, and drop an external table"; it gives a more detailed explanation.

 

Happy Hadoop


Re: (UPDATE) Hive3 Metastore occasionally isn't deleting from TBLS when a drop table is executed

Explorer

@Shelton Hi, thank you for the response.

 

To clarify, I'm not talking about the data the tables expose. I'm talking about the actual metastore tables (TBLS, CDS, SDS, SERDES, etc.). These tables describe the table structure, and if something is wrong with that metadata, the Hive tables won't function correctly.

 

The issue, I think, is that when a table is dropped, the SQL commands against the metastore aren't fully executing: the row is not deleted from TBLS, but its SD_ID is left null.

 

The concern is the metadata itself, not the underlying CSV/flat file.
