Support Questions

Find answers, ask questions, and share your expertise

Hive drop partitions using range impacts metastore

avatar
Explorer

I'm using Hadoop 2.6.0-cdh5.14.2, Hive 1.1.0-cdh5.14.2
In this system a huge table with 183K+ partition does exist, it is an external table and the command:

 

0: jdbc:hive2://hiveserver2.hd.docomodigital.> drop table unifieddata_work.old__raw_ww_eventsjson

does not work, the metastore does not reply within 600 seconds and the task ends with error.

I attempted to delete the partitions using the range:

 

0: jdbc:hive2://hiveserver2.hd.docomodigital.> alter table unifieddata_work.old__raw_ww_eventsjson drop PARTITION (country='ae', year='2017', month='01', day>'29', hour > '00' );
INFO : Compiling command(queryId=hive_20180821140909_ba6c4bb0-d0de-4fd3-a5ec-47e217289c6b): alter table unifieddata_work.old__raw_ww_eventsjson drop PARTITION (country='ae', year='2017', month='01', day>'29', hour > '00' )
INFO : Semantic Analysis Completed
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20180821140909_ba6c4bb0-d0de-4fd3-a5ec-47e217289c6b); Time taken: 0.612 seconds
INFO : Executing command(queryId=hive_20180821140909_ba6c4bb0-d0de-4fd3-a5ec-47e217289c6b): alter table unifieddata_work.old__raw_ww_eventsjson drop PARTITION (country='ae', year='2017', month='01', day>'29', hour > '00' )
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=01
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=02
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=03
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=04
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=05
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=06
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=07
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=08
INFO : Dropped the partition country=ae/year=2017/month=01/day=30/hour=09
... CUTTED HERE ...

It works but something bad happens to Metastore: canary stops working. Any idea about how to solve the issue? Is there an alternative way to delete such a big table?

4 REPLIES 4

avatar

Hi,

 try to log into the Metastore database and manually remove the table from the metadata tables. 

I think the table containing tables is TBLS. You should also remove the records from child tables, such as columns and locations.

 

Then restart the metastore, and it should ok.

 

As this is an external table, you will not remove the data with this action.

 

avatar
Explorer
It’s not a solution we can adopt in a production environment where the same problem can face again.
Any other idea?

avatar

I understand. But how many times you create and drop ta table with 180k partitions? It is a matter of simple script. But maybe you are right, the metastore should handle bigger timeouts.

avatar
Explorer
There are HUGE tables with a lot of partitions, this case is not unique. Partitioning helps to address the slice of data that matters and each partition contains a lot of data. Oh well... it's BigData at all.