10-04-2016 05:42 AM
I am using CDH 5.7, and ALTER statements used to take a long time in the beginning. At that time, I didn't investigate enough to understand the reason. These days I have started seeing slowness on CREATE, DROP, and similar statements as well, to a greater extent. After digging into this, I found that catalogd is taking too much time (evident from the log timings). While trying to analyse the reason, I saw from the "mem_rss" metric that memory is fully utilized, but I am not sure whether this is the exact cause. In addition, I am seeing slowness even for "DESC" statements, although I don't think impalad makes a call to catalogd in this case.
I tried increasing the memory for catalogd from 3 GB to 5 GB and saw a considerable improvement; DDL query execution is noticeably faster. After restarting the catalogd service, memory dropped to approximately 2 GB, then grew steadily from there and finally saturated at around 5 GB. Even now, the graph shows a steady 5 GB or so. In addition, I cleaned up some very small files. There was a table that had one Parquet file per record. I merged those small files by creating a similar table and dropping the older one. After this change, the number of files in /user/hive/warehouse/ dropped from 2.1 lakh (210,000) to 90K. But this was not reflected in the memory consumption graph immediately, which is what I was expecting. After restarting, I was able to see some reduction. However, memory has grown slowly and saturated at around 5 GB over 2-3 days.
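To compare the file count before and after a cleanup like this, a minimal sketch along the following lines could be used. It assumes the `hdfs` CLI is on the PATH and relies on the default `hdfs dfs -count` output columns (DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME); the names `PathCount`, `parse_count_line`, and `count_path` are hypothetical helpers, not part of any Impala or Hadoop API.

```python
import subprocess
from typing import NamedTuple


class PathCount(NamedTuple):
    dirs: int
    files: int
    size_bytes: int
    path: str


def parse_count_line(line: str) -> PathCount:
    """Parse one line of default `hdfs dfs -count` output.

    Expected columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
    """
    dirs, files, size, path = line.split(None, 3)
    return PathCount(int(dirs), int(files), int(size), path.strip())


def count_path(path: str) -> PathCount:
    """Run `hdfs dfs -count <path>` (assumes the hdfs CLI is installed)."""
    result = subprocess.run(
        ["hdfs", "dfs", "-count", path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_count_line(result.stdout.strip().splitlines()[0])


def average_file_size(count: PathCount) -> float:
    """Average bytes per file; a very low value hints at a small-files problem."""
    return count.size_bytes / count.files if count.files else 0.0
```

Calling `count_path("/user/hive/warehouse")` before and after merging the small files gives the two file counts to compare, and `average_file_size` gives a rough indication of how many tiny files remain.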
This setup has both partitioned and non-partitioned tables. Only column stats are generated for the non-partitioned tables. There is also one table with 20K partitions. The total number of files and blocks is around 90K.
I am trying to understand the catalog memory usage (especially the cache contents) so that this can be avoided in the future, because I am not sure when catalog performance will degrade again. Also, after a restart, why doesn't the graph start from (more or less) zero instead of 2 GB? Is it because load_catalog_in_background is set to true? There are also some JIRAs from Cloudera recommending setting it to false, but in CDH 5.7 it is on by default.
(Posted similar comments on https://issues.cloudera.org/browse/IMPALA-1480)
02-10-2018 06:56 AM
Are your state store and impalad catalog on the same host?
What is the host-level memory?
What other roles are running on the host?
Did you check the number of connection objects?