Member since
11-25-2018
11-02-2022
08:51 AM
> No metrics/graph to check "inc_stats_size"

That's what I thought.

> If 1GB is insufficient, try to use "compute stats" instead of "compute incremental stats"

However, there is a problem in that case: this table gains a new partition every hour, but "compute stats" takes well over an hour to complete.
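One alternative I am considering (a sketch only, untested on my side): Impala 2.12 and higher support sampled statistics via the TABLESAMPLE clause of COMPUTE STATS, which may finish much faster than a full scan at the cost of extrapolated estimates. Table name taken from my earlier post:

```sql
-- Sketch: compute full (non-incremental) stats on roughly 10% of the data.
-- Much faster than a full scan; row counts and NDVs are extrapolated
-- from the sample, so they are estimates rather than exact values.
COMPUTE STATS hoge_db.huga_tbl TABLESAMPLE SYSTEM(10);
```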
11-02-2022
05:52 AM
Hi @ChethanYM, I'm sorry for the late reply, and thank you for the helpful information.

COMPUTE STATS Statement | 6.3.x | Cloudera Documentation
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/impala_compute_stats.html#compute_stats

> If this metadata for all tables exceeds 2 GB, you might experience service downtime. In Impala 3.1 and higher, the issue was alleviated with an improved handling of incremental stats.

As stated above, an Impala service outage (or is it just one impalad going down?) is my concern. Are there any metrics that would let me check the inc_stats_size?

Also, what should we do if inc_stats_size_limit_bytes is insufficient even at 1 GB? We assume the number of columns and partitions is simply too large for the limit. In that case, what countermeasures should we take?

Regards, yassan
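For context, my understanding (an assumption based on the flag name, please correct me if wrong) is that inc_stats_size_limit_bytes is a catalogd/impalad startup flag, so raising it to 1 GB would look roughly like this:

```
# Sketch: impalad/catalogd command-line argument (for example via the
# Cloudera Manager command-line argument safety valve).
# Value is in bytes; 1 GB shown. Default is 200 MB.
--inc_stats_size_limit_bytes=1073741824
```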
10-16-2022
09:47 PM
Updating statistics using "COMPUTE INCREMENTAL STATS" produced the following error.

Server version: impalad version 3.2.0-cdh6.3.2 RELEASE (build 1bb9836227301b839a32c6bc230e35439d5984ac)
Query: COMPUTE INCREMENTAL STATS hoge_db.huga_tbl PARTITION ( dt >= "20221015" )
ERROR: AnalysisException: Incremental stats size estimate exceeds 200.00MB. Please try COMPUTE STATS instead.

The error message says "Please try COMPUTE STATS instead.", but I don't understand why. Wouldn't "COMPUTE STATS" give the same result?
09-08-2022
07:23 PM
Hi @amallegni, sorry for the delay in answering, and thank you for your response!
07-21-2022
09:34 AM
Hi @amallegni, thank you for your response. My important points are as follows:

- Impala external table partitions: dropping them one by one works, but there are tens of partitions that should be removed, and it would be quite tedious.
- I don't want to drop and recreate the table.
- Is using Hive's "MSCK REPAIR TABLE tablename SYNC PARTITIONS" really the only way (i.e., can't this be completed from Impala alone)?

Please let me know if you have any solutions for the above.
06-28-2022
07:19 PM
Hi, All !
I am having the same problem as below.
Drop empty Impala partitions - Stack Overflow
Impala external table partitions still show up in stats with a row count of 0 after deleting the data in HDFS, even after altering (e.g. ALTER TABLE table RECOVER PARTITIONS), refreshing (REFRESH table), and invalidating the metadata.
Trying to drop partitions one by one works, but there are tens of partitions which should be removed and it would be quite tedious.
Dropping and recreating the table would also be an option but that way all the statistics would be dropped together with the table.
Is there any kind of other options in impala to get this done?
Or is the Hive workaround below the only option?
Found a workaround through Hive: by issuing MSCK REPAIR TABLE tablename SYNC PARTITIONS and then refreshing the table in Impala, the empty partitions disappear.
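In other words, the workaround sequence is the following (a sketch of the steps described above; "tablename" stands for the actual table):

```sql
-- In Hive: re-sync the metastore with HDFS. With SYNC PARTITIONS,
-- partitions whose directories no longer exist are dropped.
MSCK REPAIR TABLE tablename SYNC PARTITIONS;

-- Then in Impala: reload the table metadata so the change is visible.
REFRESH tablename;
```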
Also, I could not find a matching issue in the Impala JIRA. If anyone knows of one, please let me know.
My environment is below.
CDH v6.3.2
reference:
REFRESH Statement | 6.3.x | Cloudera Documentation
ALTER TABLE Statement | 6.3.x | Cloudera Documentation