Reply
New Contributor
Posts: 1
Registered: ‎02-16-2017

Missing stats in Impala

Hi,

 

I need to find the tables which don't have stats computed on them.

Basically, I have many tables which are being queried upon and have slow response because of missing statistics.
I need to find the tables with missing stats, so I can run "Compute Stats" on them.

Thanks

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: Missing stats in Impala

I'm afraid there is no way to get that information with a query today. You could write a script that iterates over all databases/tables ('show databases' and then 'show tables in <tbl>') and then does a 'show column stats' and 'show table stats' to see if column stats are there.

 

Be mindful that these 'show' commands will cause the table metadata to be loaded completely.

New Contributor
Posts: 3
Registered: ‎04-04-2017

Re: Missing stats in Impala

Doesnt MySQL have PARAM_KEY = 'numRows'
and PARAM_VALUE = '-1' 

Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: Missing stats in Impala

Yes, it is possible to query the backend Metastore DB directly, but the query might turn out to be rather elaborate because you need to cover the following stats that are interesting to Impala:

1. Table level row count

2. Partition level row counts

3. Column level stats such as NDV

New Contributor
Posts: 1
Registered: ‎05-15-2017

Re: Missing stats in Impala

So this begs the question how does Cloudera Manager produce the following chart:

impala.PNG

any reasson we cant use the SQL:

 

select query_duration from DEFAULT.IMPALA_QUERIES where serviceName="impala" and (stats_missing=true or stats_corrupt=true)

 

??

Announcements