Support Questions
Find answers, ask questions, and share your expertise

How do you speed up count(*) on tables in Hive

Highlighted

How do you speed up count(*) on tables in Hive

Super Guru

I have a table with a few thousand rows in a small dev cluster and I am running a count on a ORC table and Tez is enabled.

The count takes a really really really long time or times out.

Any suggestions?

8 REPLIES 8
Highlighted

Re: How do you speed up count(*) on tables in Hive

Super Guru

@Timothy Spann counts on ORC tables should be fast as it can use the strip footer info and run much faster. Have you run stats on the table?

Highlighted

Re: How do you speed up count(*) on tables in Hive

@Timothy Spann

Have you run analyze table <tablename> compute statistics for columns?

https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables

Highlighted

Re: How do you speed up count(*) on tables in Hive

Guru

@Timothy Spann

Can you help with the explain plan along with hive -hiveconf hive.root.logger=debug,console -e 'query' output

Highlighted

Re: How do you speed up count(*) on tables in Hive

Hi have you solve this problem? I am also having same problem as yours.

Highlighted

Re: How do you speed up count(*) on tables in Hive

Mentor

@Christian Lunesa

If you are only interested in the number and not to display all lines in the table then try

select count(1) from table_name;

That should be faster, hope that helps !!!

Highlighted

Re: How do you speed up count(*) on tables in Hive

Mentor

@Christian Lunesa

If you are only interested in the number and not to display all lines in the table then try

select count(1) from table_name;

That should be faster, hope that helps !!!

Re: How do you speed up count(*) on tables in Hive

Explorer

@Geoffrey Shelton Okot it is also taking the same time. I run query select count(ayx) from test_table. It took 17 Sec. and while i run select count(1) from test_table, the time remains same.

Could you please explore more about hive query fast processing.

Thanks

Highlighted

Re: How do you speed up count(*) on tables in Hive

New Contributor

This is an old thread, but for anyone looking at it, count(1) is the same query as count(*), so there are no performance benefits to using one over the other.