Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Missing stats in Impala

avatar
New Contributor

Hi,

 

I need to find the tables which don't have stats computed on them.

Basically, I have many tables which are being queried upon and have slow response because of missing statistics.
I need to find the tables with missing stats, so I can run "Compute Stats" on them.

Thanks

4 REPLIES 4

avatar

I'm afraid there is no way to get that information with a query today. You could write a script that iterates over all databases/tables ('show databases' and then 'show tables in <tbl>') and then does a 'show column stats' and 'show table stats' to see if column stats are there.

 

Be mindful that these 'show' commands will cause the table metadata to be loaded completely.

avatar
Explorer

Doesnt MySQL have PARAM_KEY = 'numRows'
and PARAM_VALUE = '-1' 

avatar

Yes, it is possible to query the backend Metastore DB directly, but the query might turn out to be rather elaborate because you need to cover the following stats that are interesting to Impala:

1. Table level row count

2. Partition level row counts

3. Column level stats such as NDV

avatar
New Contributor

So this begs the question how does Cloudera Manager produce the following chart:

impala.PNG

any reasson we cant use the SQL:

 

select query_duration from DEFAULT.IMPALA_QUERIES where serviceName="impala" and (stats_missing=true or stats_corrupt=true)

 

??