Support Questions
Find answers, ask questions, and share your expertise

COMPUTE STATS - Database scripts

New Contributor

Hi Team we have a new application installed on our Cloudera Cluster which can read from a database and import all tables into Hive/Impala without us writing any CREATE statements. Once the tables are created we need to perform COMPUTE STATS at Impala level and ANALYZE Table COMPUTE STATISTICS at hive level. Is there a specific way to script this? OR we need to put all COMPUTE STATS statements in one script and run it on Impala/hive command line? I know hive and impala do not have SYS tables where we can get list of each tables. Ideally we would want to schedule the COMPUTE STATS for both Hive and Impala post all our weekly loads are done. The script should be able to dynamically pick all NEW tables created in Hive/Impala across databases and perform the Compute stats. Appreciate any help. Regards JitenR


Expert Contributor

Hi @JitenR,


You will need run INVALIDATE METADATA weekly in your script and then recovery with SHOW TABLES all new entrys for execute COMPUTE STATS.




New Contributor

Thank You Manu. I was able to write the script and automate the process.



Jitender Renwa


To  add to your his points .


If your table is partioned then you use the below - quoting it from Cloudera KB 

The COMPUTE INCREMENTAL STATS variation is a shortcut for partitioned tables that works on a subset of partitions rather than the entire table

Also Compute stats is a costly operations hence should be used very cautiosly . 

Some impala query may fail while performing compute stats . Hence chose Refresh command vs Compute stats accordingly . Hope this helps. Although you had done with your script 🙂 

Take a Tour of the Community
Don't have an account?
Your experience may be limited. Sign in to explore more.