Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

How do you speed up count(*) on tables in Hive

How do you speed up count(*) on tables in Hive

Super Guru

I have a table with a few thousand rows in a small dev cluster and I am running a count on a ORC table and Tez is enabled.

The count takes a really really really long time or times out.

Any suggestions?

8 REPLIES 8

Re: How do you speed up count(*) on tables in Hive

Super Guru

@Timothy Spann counts on ORC tables should be fast as it can use the strip footer info and run much faster. Have you run stats on the table?

Re: How do you speed up count(*) on tables in Hive

@Timothy Spann

Have you run analyze table <tablename> compute statistics for columns?

https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables

Re: How do you speed up count(*) on tables in Hive

Guru

@Timothy Spann

Can you help with the explain plan along with hive -hiveconf hive.root.logger=debug,console -e 'query' output

Re: How do you speed up count(*) on tables in Hive

New Contributor

Hi have you solve this problem? I am also having same problem as yours.

Re: How do you speed up count(*) on tables in Hive

Mentor

@Christian Lunesa

If you are only interested in the number and not to display all lines in the table then try

select count(1) from table_name;

That should be faster, hope that helps !!!

Re: How do you speed up count(*) on tables in Hive

Mentor

@Christian Lunesa

If you are only interested in the number and not to display all lines in the table then try

select count(1) from table_name;

That should be faster, hope that helps !!!

Highlighted

Re: How do you speed up count(*) on tables in Hive

New Contributor

@Geoffrey Shelton Okot it is also taking the same time. I run query select count(ayx) from test_table. It took 17 Sec. and while i run select count(1) from test_table, the time remains same.

Could you please explore more about hive query fast processing.

Thanks

Re: How do you speed up count(*) on tables in Hive

New Contributor

This is an old thread, but for anyone looking at it, count(1) is the same query as count(*), so there are no performance benefits to using one over the other.