Hi Jais,

I recommend updating the column stats (number of distinct values per column) and the table stats (table/partition row counts) separately. COMPUTE STATS is expensive mostly because of the column stats, but the number of distinct values typically changes much more slowly than the row count.

You can therefore run the full COMPUTE STATS less frequently (e.g., once your table size has doubled).

You can update the table stats (row counts) much more cheaply by running SELECT COUNT(*) and then using ALTER TABLE to set the new row count manually.
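
As a rough sketch, the sequence could look like the following. The table name sales_data and the count 1500000 are made up for illustration, and the property names are the ones I recall from the Impala docs for this release, so please check them against the link below for your version:

```sql
-- Hypothetical table name and row count; substitute your own.

-- 1. Get the current row count (much cheaper than a full COMPUTE STATS).
SELECT COUNT(*) FROM sales_data;

-- 2. Record that count as the table-level row count.
--    Replace 1500000 with the value returned by step 1.
ALTER TABLE sales_data
  SET TBLPROPERTIES ('numRows'='1500000', 'STATS_GENERATED_VIA_STATS_TASK'='true');

-- 3. Check that the #Rows column now shows the new value.
SHOW TABLE STATS sales_data;
```

For a partitioned table, the same ALTER TABLE statement should accept a PARTITION (...) clause before SET TBLPROPERTIES, so you can set the row count for only the partitions that changed.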

The procedure is described in more detail here:

http://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_perf_stats.html#perf_stats

Alex