
Impala Catalogue Server OOM

Rising Star

Hello,

We are observing that our Impala Catalogue Server process frequently exits / gets killed: "This role encountered 1 unexpected exit(s) in the previous 5 minute(s). This included 1 exit(s) due to OutOfMemory errors. Critical threshold: any."

I went through this article (https://community.cloudera.com/t5/Support-Questions/Cloudera-6-2-1-Impala-GC-Overhead-limit-Exceeded...) and the problem seems to be heap-memory related. I would like to know whether there is any way / calculation to determine how much heap should be allocated to avoid these issues.

 

CM / CDH 5.16.2

Java Heap Size of Catalog Server in Bytes = 15 GB

 

Appreciate any guidance in this regard.

1 ACCEPTED SOLUTION

Expert Contributor

There is a calculation in The Impala Cookbook to estimate the heap memory usage for metadata:

 

• num of tables * 5KB + num of partitions * 2KB + num of files * 750B + num of file blocks * 300B + sum(incremental col stats per table)

Incremental stats: for each table, num columns * num partitions * 400B

 

Usually, insufficient catalog heap memory is caused by a large number of small files and/or partitions. For example, a single 512MB file (four 128MB HDFS blocks, three replicas each) needs 750B + 4 * 3 * 300B = 4,350B. But if we split that file into 128 files of 4MB each, those files will use 128 * 750B + 128 * 3 * 300B = 211,200B, nearly 50 times as much!
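The rule of thumb above can be sketched as a small Python helper (a minimal sketch, not an official tool; the per-object costs are the Cookbook estimates quoted above, and the worked example assumes a 128MB HDFS block size with 3-way replication):

```python
# Rough Impala catalog heap estimator from the Cookbook rule of thumb.
# Constants are the per-object costs quoted above; actual usage varies.

def catalog_heap_bytes(num_tables, num_partitions, num_files, num_blocks,
                       incr_stats_tables=()):
    """Estimate catalog heap usage in bytes.

    incr_stats_tables: iterable of (num_columns, num_partitions) pairs for
    tables keeping incremental stats (400B per column per partition).
    """
    heap = (num_tables * 5 * 1024        # 5KB per table
            + num_partitions * 2 * 1024  # 2KB per partition
            + num_files * 750            # 750B per file
            + num_blocks * 300)          # 300B per file block (replicas incl.)
    heap += sum(cols * parts * 400 for cols, parts in incr_stats_tables)
    return heap

# The small-file example above: one 512MB file = 4 blocks * 3 replicas.
single = 1 * 750 + 4 * 3 * 300          # 4350 bytes
split = 128 * 750 + 128 * 3 * 300       # 211200 bytes
print(single, split, round(split / single, 1))  # → 4350 211200 48.6
```

Plugging your cluster's table, partition, file, and block counts into a helper like this gives a lower bound to compare against the 15 GB currently configured; real heap usage will be higher once GC overhead and transient load are included.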

 


3 REPLIES


Expert Contributor

@Amn_468 

Please refer to the Knowledge Base article below for the heap memory calculation.

https://my.cloudera.com/knowledge/Impaladembeddedjvmheapsize-has-a-default-value-of-32-GB-after?id=2...

Community Manager

@Amn_468 Have any of the replies helped resolve your issue? If so, please mark the appropriate reply as the solution; that will make it easier for others to find the answer in the future.



Regards,

Vidya Sargur,
Community Manager
