Member since 10-10-2018
Posts: 8
Kudos Received: 0
Solutions: 0
01-23-2020
06:04 AM
Thank you for that explanation
01-21-2020
06:27 AM
We are still using CDH 5.10 and Impala 2.7, and there is a startup option `inc_stats_size_limit_bytes`, which is described in https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_perf_stats.html, section "Maximum Serialized Stats Size" (which is identical to the 5.10 docs). However, it does not really describe what this setting means or how to predict when it becomes a problem.

> The inc_stats_size_limit_bytes limit is set as a safety check, to prevent Impala from hitting the maximum limit for the table metadata. Note that this limit is only one part of the entire table's metadata all of which together must be below 2 GB.

With pretty big tables and a lot of live partitions, we had to increase catalogd memory to be able to cope with the amount of metadata. Now, in addition, individual large tables were not loaded during compute incremental stats because they reached `inc_stats_size_limit_bytes`, which is far less than catalogd memory.

But what does that mean? How can we calculate this limit or predict it? We didn't find any metrics in https://docs.cloudera.com/documentation/enterprise/5-10-x/topics/cm_metrics_impala_catalog_server.html, nor can we calculate the expected limit / restriction imposed by this config from anything found in https://docs.huihoo.com/cloudera/The-Impala-Cookbook.pdf (which works just fine for the expected catalogd memory).

In short:
- What does `inc_stats_size_limit_bytes` mean?
- Can we predict / calculate a needed value for `inc_stats_size_limit_bytes` for our tables?
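
To make the second question concrete, here is a rough back-of-the-envelope sketch of the kind of calculation we are looking for. It assumes the figure of roughly 400 bytes of incremental stats metadata per column per partition that the Impala documentation mentions, and a limit of 200 MB; both numbers are assumptions that should be checked against the docs and the catalogd flags for the version in use. The function names and the example table shape are hypothetical and only for illustration.

```python
# Assumption: ~400 bytes of incremental stats metadata per column per partition.
ASSUMED_BYTES_PER_COLUMN_PER_PARTITION = 400

# Assumption: a limit of 200 MB; replace with the value actually configured
# for inc_stats_size_limit_bytes on catalogd.
ASSUMED_LIMIT_BYTES = 200 * 1024 * 1024


def estimate_inc_stats_bytes(num_partitions, num_columns,
                             bytes_per_cell=ASSUMED_BYTES_PER_COLUMN_PER_PARTITION):
    """Rough size estimate of a table's incremental stats (not an exact figure)."""
    return num_partitions * num_columns * bytes_per_cell


def exceeds_limit(num_partitions, num_columns, limit_bytes=ASSUMED_LIMIT_BYTES):
    """True if the rough estimate is above the given limit."""
    return estimate_inc_stats_bytes(num_partitions, num_columns) > limit_bytes


if __name__ == "__main__":
    # Hypothetical table shape: 10,000 partitions, 100 columns.
    estimate = estimate_inc_stats_bytes(10_000, 100)
    print(f"estimated incremental stats size: {estimate} bytes")
    print(f"above assumed limit: {exceeds_limit(10_000, 100)}")
```

If an estimate of this kind is valid, it would let us compare partitions times columns per table against the configured limit before running compute incremental stats.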
Labels:
- Apache Impala