Created on 05-21-2024 05:45 AM - last edited on 07-22-2024 12:48 PM by cjervis
This is the query which is running on metastore_db
SELECT
`A0` . `PART_NAME`
FROM
`PARTITIONS` `A0`
LEFT OUTER JOIN
`TBLS` `B0`
ON
`A0` . `TBL_ID` = `B0` . `TBL_ID`
LEFT OUTER JOIN
`DBS` `C0`
ON
`B0` . `DB_ID` = `C0` . `DB_ID`
WHERE
`C0` . `NAME` = ?
AND `C0` . `CTLG_NAME` = ?
AND `B0` . `TBL_NAME` = ?
AND `A0` . `PART_NAME` LIKE ? ESCAPE ?
how we can optimize it ? Is it system generated?
or from where it is coming at metastore_db database that is inside mysql.
SELECT
"AO"
"PART NAME' FROM 'PARTITIONS
AO' LEFT OUTER JOIN 'TBLS
"RO' ON 'AO'
• 'TBL ID'
** Note ** I am using Trino hive architecture.
Thanks in advance for your help....
Created 05-21-2024 11:21 AM
@shrikantverma Welcome to the Cloudera Community!
To help you get the best possible solution, I have tagged our Hive and Spark experts @james_jones @cravani @mszurap who may be able to assist you further.
Please keep us updated on your post, and we hope you find a satisfactory solution to your query.
Regards,
Diana Torres,Created 05-22-2024 02:14 AM
This indeed seems to be a generated query, however based on this only it is not clear if this is coming from the Hive Metastore. You can review with your DB tools which user is submitting these queries, for example for MySQL use "show processlist" or slow query logging to capture which user is submitting these. Alternatively stop the Hive metastore (if allowed) and observe if those queries are still appearing.
As I see this query might be used by HMS while listing partitions by name in MetaStoreDirectSql.java#getPartitionsUsingProjectionAndFilterSpec().
Is this inefficient in your database? Have you checked that with an "explain <query>" or why do you want to optimize this?
The HMS database should have indexes on the affected tables (DBS, TBLS, PARTITIONS) to speed up these queries, the only scan needed is to match the partition names - which can be slower if the Hive table has too many partitions.
Best regards
Miklos
Created on 05-27-2024 08:39 PM - edited 05-27-2024 08:43 PM
Hi had checked the "show processlist" and this query is running there and multiple time it's running problem is that it is scanning all rows, means not in optimize state
and if it is auto generated query then how we can pass the partition information.
and this query is not run by user, they are running optimize query it seems it taking some metadata from mysql, something like below.....
So that's why i was thinking it meta generated query.
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+----------------------------------+-----------------+---------+-------------+---------+----------+-------------+
| 1 | SIMPLE | C0 | NULL | const | PRIMARY,UNIQUE_DATABASE,CTLG_FK1 | UNIQUE_DATABASE | 389 | const,const | 1 | 100.00 | Using index |
| 1 | SIMPLE | B0 | NULL | const | PRIMARY,UNIQUETABLE,TBLS_N49 | UNIQUETABLE | 268 | const,const | 1 | 100.00 | Using index |
| 1 | SIMPLE | A0 | NULL | ref | PARTITIONS_N49 | PARTITIONS_N49 | 9 | const | 5555098 | 11.11 | Using where
Created 05-27-2024 08:28 PM
@shrikantverma Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.
Regards,
Diana Torres,