Created 11-11-2016 04:20 PM
I have a table with a few thousand rows in a small dev cluster. I am running a count on an ORC table with Tez enabled.
The count takes a really long time or times out.
Any suggestions?
Created 11-11-2016 04:24 PM
@Timothy Spann counts on ORC tables should be fast, since Hive can use the stripe footer info instead of scanning the data. Have you run stats on the table?
Created 11-11-2016 04:42 PM
Have you run analyze table <tablename> compute statistics for columns?
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables
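As a rough sketch of what that looks like in practice (test_table is a placeholder name, and whether the count is actually answered from statistics depends on your Hive version and on the stats being up to date):

```sql
-- test_table is a placeholder; substitute your own table name.

-- Gather basic table-level statistics (row count, size, etc.).
ANALYZE TABLE test_table COMPUTE STATISTICS;

-- Gather column-level statistics.
ANALYZE TABLE test_table COMPUTE STATISTICS FOR COLUMNS;

-- Check that numRows now appears under Table Parameters.
DESCRIBE FORMATTED test_table;

-- Allow Hive to answer simple aggregates such as count(*) from stored stats.
SET hive.compute.query.using.stats=true;

SELECT COUNT(*) FROM test_table;
```

If the statistics are present and the setting is on, the count may come back without launching a Tez job at all.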
Created 11-11-2016 05:36 PM
Can you share the explain plan, along with the output of hive -hiveconf hive.root.logger=debug,console -e 'query'?
Created 04-22-2018 11:26 PM
Hi, have you solved this problem? I am having the same problem.
Created 04-23-2018 07:44 AM
If you are only interested in the number of rows and do not need to display every row in the table, then try
select count(1) from table_name;
That should be faster. Hope that helps!
Created 09-25-2018 12:47 PM
@Geoffrey Shelton Okot it takes the same amount of time. I ran the query select count(ayx) from test_table and it took 17 seconds. When I run select count(1) from test_table, the time remains the same.
Could you please explain more about how to speed up Hive queries?
Thanks
Created 04-18-2019 06:17 PM
This is an old thread, but for anyone looking at it, count(1) is the same query as count(*), so there are no performance benefits to using one over the other.
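One way to check this on your own cluster is to compare the plans. A minimal sketch, reusing the test_table and ayx names from the earlier post (the exact plan output varies by Hive version):

```sql
-- These two should produce the same plan: both count every row.
EXPLAIN SELECT COUNT(*) FROM test_table;
EXPLAIN SELECT COUNT(1) FROM test_table;

-- This one is a different query: it counts only rows where ayx is not NULL,
-- so its result can be lower than COUNT(*).
EXPLAIN SELECT COUNT(ayx) FROM test_table;
```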