Impala Lock contention error while running Refresh Command


Hello,
We are seeing Impala "CatalogException: Error refreshing metadata for table due to lock contention" errors on our cluster.

I came across this article: https://community.cloudera.com/t5/Support-Questions/Impala-Error-updating-the-catalog-due-to-lock-co... We checked the suggestions made there by Cloudera support: load_catalog_in_background is already unchecked, and we do not find any JVM pause errors in the catalog logs.
Java Heap Size of Catalog Server in Bytes = 28G
CDH 6.3.3

Would appreciate any help in understanding why this happens and ways to fix this.

1 ACCEPTED SOLUTION

Hi @Amn_468 ,

Lock contention happens when too many "invalidate metadata" (IM) and "refresh" commands run at the same time. The catalog daemon (catalogd) is responsible for loading the Hive Metastore metadata (Hive table and partition information, including stats) and the HDFS metadata (the list of files and their block locations). When a table is refreshed (or loaded for the first time after an IM), catalogd has to load this metadata, and there are built-in limits on how many tables, partitions, and files it can load at a time. While doing so it has to hold a lock on the "catalog update", so that simultaneous requests do not overwrite the previously collected information.

So if there are concurrent, long-running "refresh" statements [1], they can block each other and delay the publishing of the catalog information.
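To make the blocking effect concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: all requests arrive at the same moment, they queue behind one exclusive lock, and a request fails if it waits longer than a timeout. The function name, the durations, and the timeout value are hypothetical, and this is not how catalogd is actually implemented.

```python
from typing import List, Optional


def simulate_exclusive_lock(durations: List[float],
                            lock_timeout: float) -> List[Optional[float]]:
    """Return each request's wait time, or None if it gave up waiting.

    Toy model: every request arrives at time 0 and requests are served
    in list order under a single exclusive lock.
    """
    waits: List[Optional[float]] = []
    lock_free_at = 0.0  # time at which the lock next becomes available
    for duration in durations:
        wait = lock_free_at
        if wait > lock_timeout:
            # The request times out and never holds the lock.
            waits.append(None)
        else:
            waits.append(wait)
            lock_free_at += duration
    return waits


# Three heavy refreshes (60 s each) followed by a light one (5 s),
# with a hypothetical 100 s lock timeout.
heavy = simulate_exclusive_lock([60, 60, 60, 5], lock_timeout=100)

# The same number of requests, but each one is short.
light = simulate_exclusive_lock([5, 5, 5, 5], lock_timeout=100)
```

In this model the long refreshes push the later requests past the timeout, while the short ones all finish, which is the reasoning behind keeping refreshes few and small.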

What can be done is to:

- reduce the number of IM calls

- reduce the number of refresh calls

- wherever possible, refresh at the partition level only

- There were some improvements in IMPALA-6671, which is available in CDP 7.1.7 SP1, so an upgrade could also help (although it still cannot fully compensate for high-frequency, heavy refreshes)
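As an illustration of the first three points, here is a small Python sketch that turns a batch of refresh requests into de-duplicated, partition-level REFRESH statements. The helper names, the table name, and the partition columns are made up for the example. It only builds the statement text; how you submit it (impala-shell, JDBC, etc.) is up to you.

```python
from typing import Dict, Iterable, List, Tuple


def _literal(value: object) -> str:
    """Render a partition value as a SQL literal (ints bare, strings quoted)."""
    if isinstance(value, bool):
        raise TypeError("boolean partition values are not handled in this sketch")
    if isinstance(value, int):
        return str(value)
    text = str(value)
    if "'" in text:
        raise ValueError("quotes in partition values are not handled in this sketch")
    return "'" + text + "'"


def refresh_statement(table: str, partition: Dict[str, object]) -> str:
    """Build a REFRESH statement limited to one partition."""
    spec = ", ".join(f"{col}={_literal(val)}" for col, val in partition.items())
    return f"REFRESH {table} PARTITION ({spec})"


def dedupe_refreshes(
    requests: Iterable[Tuple[str, Dict[str, object]]]
) -> List[str]:
    """Collapse repeated requests so each partition is refreshed only once."""
    seen = set()
    statements: List[str] = []
    for table, partition in requests:
        stmt = refresh_statement(table, partition)
        if stmt not in seen:
            seen.add(stmt)
            statements.append(stmt)
    return statements


# Hypothetical batch: two jobs ask for the same partition.
requests = [
    ("sales.events", {"year": 2023, "month": 1}),
    ("sales.events", {"year": 2023, "month": 1}),
    ("sales.events", {"day": "2023-01-02"}),
]
statements = dedupe_refreshes(requests)
```

Here the three requests collapse into two statements, each touching a single partition instead of the whole table.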

 

I hope this helps in the discussions with the users/teams about how frequently and when they submit refresh queries.

 

 Miklos

Customer Operations Engineer, Cloudera

 

[1] https://impala.apache.org/docs/build3x/html/topics/impala_refresh.html

2 REPLIES


@mszurap 

Thanks for your detailed explanation; this certainly helps.