Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Un-optimize queries are running on metastore_db that's are causing the slowness and high resource uses..

avatar
New Contributor

This is the query which is running on metastore_db

SELECT
`A0` . `PART_NAME`
FROM
`PARTITIONS` `A0`
LEFT OUTER JOIN
`TBLS` `B0`
ON
`A0` . `TBL_ID` = `B0` . `TBL_ID`
LEFT OUTER JOIN
`DBS` `C0`
ON
`B0` . `DB_ID` = `C0` . `DB_ID`
WHERE
`C0` . `NAME` = ?
AND `C0` . `CTLG_NAME` = ?
AND `B0` . `TBL_NAME` = ?
AND `A0` . `PART_NAME` LIKE ? ESCAPE ?


how we can optimize it ? Is it system generated?
or from where it is coming at metastore_db database that is inside mysql.

SELECT
"AO"
"PART NAME' FROM 'PARTITIONS
AO' LEFT OUTER JOIN 'TBLS
"RO' ON 'AO'
• 'TBL ID'

** Note ** I am using Trino hive architecture. 

Thanks in advance for your help....

4 REPLIES 4

avatar
Community Manager

@shrikantverma Welcome to the Cloudera Community!

To help you get the best possible solution, I have tagged our Hive and Spark experts  @james_jones @cravani @mszurap  who may be able to assist you further.

Please keep us updated on your post, and we hope you find a satisfactory solution to your query.


Regards,

Diana Torres,
Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community:

avatar

Hi @shrikantverma 

This indeed seems to be a generated query, however based on this only it is not clear if this is coming from the Hive Metastore. You can review with your DB tools which user is submitting these queries, for example for MySQL use "show processlist" or slow query logging to capture which user is submitting these. Alternatively stop the Hive metastore (if allowed) and observe if those queries are still appearing.

As I see this query might be used by HMS while listing partitions by name in MetaStoreDirectSql.java#getPartitionsUsingProjectionAndFilterSpec().

Is this inefficient in your database? Have you checked that with an "explain <query>" or why do you want to optimize this?

The HMS database should have indexes on the affected tables (DBS, TBLS, PARTITIONS) to speed up these queries, the only scan needed is to match the partition names - which can be slower if the Hive table has too many partitions.

Best regards

 Miklos

avatar
New Contributor

Hi had checked the "show processlist" and this query is running there and multiple time it's running problem is that it is scanning all rows, means not in optimize state
and if it is auto generated query then how we can pass the partition information.
and this query is not run by user, they are running optimize query it seems it taking some metadata from mysql, something like below.....
So that's why i was thinking it meta generated query. 
| id | select_type | table | partitions | type  | possible_keys                    | key             | key_len | ref         | rows    | filtered | Extra       |

+----+-------------+-------+------------+-------+----------------------------------+-----------------+---------+-------------+---------+----------+-------------+

|  1 | SIMPLE      | C0    | NULL       | const | PRIMARY,UNIQUE_DATABASE,CTLG_FK1 | UNIQUE_DATABASE | 389     | const,const |       1 |   100.00 | Using index |

|  1 | SIMPLE      | B0    | NULL       | const | PRIMARY,UNIQUETABLE,TBLS_N49     | UNIQUETABLE     | 268     | const,const |       1 |   100.00 | Using index |

|  1 | SIMPLE      | A0    | NULL       | ref   | PARTITIONS_N49                   | PARTITIONS_N49  | 9       | const       | 5555098 |    11.11 | Using where

avatar
Community Manager

@shrikantverma Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.  Thanks.


Regards,

Diana Torres,
Community Moderator


Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.
Learn more about the Cloudera Community: