
select count query taking more time in hive/tez


New Contributor

When I run the Hive query

select count(*) from mytable

it takes a long time: for a table of 27 million rows, it runs for 30 minutes. I use HDP 2.6.4 with Hive 1.2.1000 and Tez 0.7.0.


Best regards

3 REPLIES 3

Re: select count query taking more time in hive/tez

Super Guru

@Abderrahim BOUDI


Good articles on tuning Hive performance: Hive_performance_tune, Tez_Performance_Tune, ExplainPlan.

This question is too broad to answer precisely, but here are my thoughts:

1. Check whether your Hive job actually starts running in the ResourceManager (rather than waiting in the queue for resources, i.e. sitting in the ACCEPTED state).

2. Check in HDFS how many files are in the directory the table points to. Too many small files will result in poor performance; consolidate the small files into bigger ones, then run the query again.

3. Try running the Hive console in debug mode to see where the job is spending its time.

4. Check whether there is any skew in the data, and if so create the table with the skewed columns declared in the table properties.
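
For point 2, one rough way to judge whether small files are the problem is to save the output of `hdfs dfs -ls -R` on the table directory and summarize the file sizes. The sketch below is only an illustration: the 128 MB threshold, the function name `summarize_listing`, and the sample paths are assumptions, not values from this thread.

```python
# Sketch: summarize file sizes from saved `hdfs dfs -ls -R <table_dir>` output.
# The threshold and the sample listing are assumptions for illustration only.

SMALL_FILE_BYTES = 128 * 1024 * 1024  # assumed threshold, about one HDFS block


def summarize_listing(listing_text, threshold=SMALL_FILE_BYTES):
    """Return (file_count, small_file_count, total_bytes) for a listing."""
    file_count = 0
    small_count = 0
    total_bytes = 0
    for line in listing_text.splitlines():
        # Entry layout: perms, replication, owner, group, size, date, time, path
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skips the "Found N items" header and blank lines
        if fields[0].startswith("d") or not fields[4].isdigit():
            continue  # skips directories and anything that is not a file entry
        size = int(fields[4])
        file_count += 1
        total_bytes += size
        if size < threshold:
            small_count += 1
    return file_count, small_count, total_bytes


# Hypothetical listing; in practice, paste the real command output here.
SAMPLE = """Found 3 items
drwxr-xr-x   - hive hdfs          0 2018-05-01 10:15 /apps/hive/warehouse/mytable/dt=1
-rw-r--r--   3 hive hdfs    1048576 2018-05-01 10:15 /apps/hive/warehouse/mytable/dt=1/000000_0
-rw-r--r--   3 hive hdfs  268435456 2018-05-01 10:16 /apps/hive/warehouse/mytable/dt=1/000001_0
"""

files, small, total = summarize_listing(SAMPLE)
print("files:", files, "small files:", small, "total bytes:", total)
```

If most of the files fall well below the block size, consolidating them is probably worth trying before tuning anything else.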

Re: select count query taking more time in hive/tez

New Contributor
@Shu 

By default, hive.exec.reducers.bytes.per.reducer is set to 64 MB. As I am on Hadoop 2.7, should I set it to 128 MB, or to 256 MB as indicated in the documentation you sent me?
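
To get a feel for what each value would change, a simplified estimate of the reducer count is the stage input size divided by hive.exec.reducers.bytes.per.reducer, capped by hive.exec.reducers.max. The sketch below assumes a hypothetical 10 GB input (the thread does not give the table size) and a cap of 1009; Tez may adjust the real number at runtime, so treat the figures as rough.

```python
# Sketch: simplified reducer estimate, ceil(input / bytes_per_reducer) with a cap.
# The 10 GB input size and the cap of 1009 are assumptions for illustration.

MB = 1024 * 1024
GB = 1024 * MB


def estimate_reducers(input_bytes, bytes_per_reducer, max_reducers=1009):
    """Return a rough reducer count between 1 and max_reducers."""
    needed = -(-input_bytes // bytes_per_reducer)  # integer ceiling division
    return max(1, min(max_reducers, needed))


assumed_input = 10 * GB  # hypothetical input size
for setting_mb in (64, 128, 256):
    reducers = estimate_reducers(assumed_input, setting_mb * MB)
    print(setting_mb, "MB per reducer ->", reducers, "reducers (estimate)")
```

A larger value means fewer, bigger reducers; which one works best depends on the data volume and the queue capacity, so it is worth timing the query with each setting.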
