Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Missing stats in Impala

Missing stats in Impala

New Contributor

Hi,

 

I need to find the tables which don't have stats computed on them.

Basically, I have many tables which are being queried upon and have slow response because of missing statistics.
I need to find the tables with missing stats, so I can run "Compute Stats" on them.

Thanks

4 REPLIES 4

Re: Missing stats in Impala

Master Collaborator

I'm afraid there is no way to get that information with a query today. You could write a script that iterates over all databases/tables ('show databases' and then 'show tables in <tbl>') and then does a 'show column stats' and 'show table stats' to see if column stats are there.

 

Be mindful that these 'show' commands will cause the table metadata to be loaded completely.

Highlighted

Re: Missing stats in Impala

Explorer

Doesnt MySQL have PARAM_KEY = 'numRows'
and PARAM_VALUE = '-1' 

Re: Missing stats in Impala

Master Collaborator

Yes, it is possible to query the backend Metastore DB directly, but the query might turn out to be rather elaborate because you need to cover the following stats that are interesting to Impala:

1. Table level row count

2. Partition level row counts

3. Column level stats such as NDV

Re: Missing stats in Impala

New Contributor

So this begs the question how does Cloudera Manager produce the following chart:

impala.PNG

any reasson we cant use the SQL:

 

select query_duration from DEFAULT.IMPALA_QUERIES where serviceName="impala" and (stats_missing=true or stats_corrupt=true)

 

??

Don't have an account?
Coming from Hortonworks? Activate your account here