Support Questions

Amn_468 · ‎06-03-2022

Hello,
We are observing Impala "CatalogException: Error refreshing metadata for table due to lock contention" errors on our cluster.

I came across this article: https://community.cloudera.com/t5/Support-Questions/Impala-Error-updating-the-catalog-due-to-lock-co... We checked the suggestions made by Cloudera tech there: load_catalog_in_background is already unchecked, and we do not find any JVM pause errors in the catalog logs.
Java Heap Size of Catalog Server in Bytes = 28G
CDH 6.3.3

Would appreciate any help in understanding why this happens and ways to fix this.

mszurap · ‎06-03-2022

Hi @Amn_468 ,

The lock contention happens when there are too many "invalidate metadata" (IM) and "refresh" commands running. The catalog daemon (catalogd) is responsible for loading the Hive Metastore metadata (Hive table and partition information, including stats) and the HDFS metadata (the list of files and their block locations). When a table is refreshed (or loaded for the first time after an IM), catalogd has to load this metadata. It has some built-in limits and a maximum throughput for how many tables, partitions, and files it can load. While loading, it needs to hold a lock on the "catalog update" so that simultaneous requests do not overwrite the previously collected information.

So if there are concurrent, long-running "refresh" statements [1], they can block each other and delay the publishing of the catalog information.

What can be done is to:

- reduce the number of IM calls

- reduce the number of refresh calls

- wherever possible, use refresh at the partition level only

- upgrade if you can: IMPALA-6671 brought some improvements and is available in CDP 7.1.7 SP1 (it still cannot completely help with high-frequency, heavy refreshes)
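As a rough illustration of the second and third points, here is a minimal Python sketch of how a job could collapse a batch of pending refresh requests before sending them to Impala. It only builds statement strings using the REFRESH ... PARTITION syntax from the documentation in [1]. The function names, the event format, and the table names are hypothetical, not part of Impala or any Cloudera tool, and submitting the statements is left to whatever client you already use.

```python
def quote_value(value):
    """Render a partition value: integers unquoted, everything else as a quoted string."""
    if isinstance(value, int):
        return str(value)
    text = str(value)
    if "'" in text or "\\" in text:
        # Keep the sketch simple: refuse values that would need escaping.
        raise ValueError(f"unsupported partition value: {text!r}")
    return f"'{text}'"


def build_refresh_statements(events):
    """Collapse refresh requests into a minimal list of REFRESH statements.

    events is an iterable of (table, partition) pairs, where partition is a
    dict of partition column -> value, or None for a whole-table refresh.
    (This event format is an assumption made for the example.)
    """
    pending = {}  # table -> None (whole table) or {dedup key: partition items}
    for table, partition in events:
        if partition is None:
            # A whole-table refresh makes partition refreshes of that table redundant.
            pending[table] = None
            continue
        specs = pending.setdefault(table, {})
        if specs is not None:
            items = list(partition.items())
            # Sort only for the dedup key, so column order does not create duplicates.
            specs.setdefault(tuple(sorted(items)), items)

    statements = []
    for table, specs in pending.items():
        if specs is None:
            statements.append(f"REFRESH {table}")
            continue
        for items in specs.values():
            cols = ", ".join(f"{col}={quote_value(val)}" for col, val in items)
            statements.append(f"REFRESH {table} PARTITION ({cols})")
    return statements


# Hypothetical batch: five requests collapse into three statements.
events = [
    ("sales.orders", {"year": 2022, "month": 6}),
    ("sales.orders", {"year": 2022, "month": 6}),  # duplicate request
    ("sales.orders", {"year": 2022, "month": 5}),
    ("sales.customers", None),                     # whole-table refresh
    ("sales.customers", {"region": "emea"}),       # covered by the line above
]
for statement in build_refresh_statements(events):
    print(statement)
```

Batching like this does not remove the lock inside catalogd. It only lowers the number and size of the refreshes that compete for it, so how often the batch is flushed still matters.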

I hope this helps your discussions with the users/teams about how frequently and when they submit refresh queries.

Miklos

Customer Operations Engineer, Cloudera

[1] https://impala.apache.org/docs/build3x/html/topics/impala_refresh.html

Amn_468 · ‎06-05-2022

@mszurap

Thanks for your detailed explanation. This certainly helps.
