Support Questions

zack_riesland · ‎04-04-2016

I'm trying to tune our cluster to optimize performance.

Currently, we still have default values for hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max.

According to the documentation, in Hive 0.13, hive.exec.reducers.bytes.per.reducer should default to 256 MB, but Ambari (our HDP stack is 2.2.8) appears to be defaulting this to 64 MB. But in Hive 0.14, the default goes all the way up to 1 GB.

And then for hive.exec.reducers.max, the HDP default is 1,009.

I'm trying to understand how best to set these values. It seems like there is a relationship between these values, the cluster specs, and also the YARN settings, and I'm trying to understand the relationship.

For hive.exec.reducers.max, I would think it should be a multiple of: number of data nodes x number of CPUs per node. So for a cluster with 10 data nodes and 16 CPUs per node, it would probably be a multiple of 160. Right? Maybe 320 or 480?
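To make the arithmetic above concrete, here is a minimal sketch. The function name and the "one reducer slot per CPU" starting point are illustrative assumptions drawn from the question itself, not a documented Hive sizing rule.

```python
def candidate_reducer_caps(data_nodes, cpus_per_node, multiples=(1, 2, 3)):
    """Return candidate hive.exec.reducers.max values as multiples of total CPUs.

    Assumption: total CPUs (nodes x CPUs per node) is a sensible base unit.
    """
    base = data_nodes * cpus_per_node
    return [base * m for m in multiples]


# The cluster from the question: 10 data nodes, 16 CPUs per node.
print(candidate_reducer_caps(10, 16))  # [160, 320, 480]
```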

hive.exec.reducers.bytes.per.reducer is a bit more mysterious. The default went up by a factor of 20 between 0.13 and 0.14. Why?
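For context on how the two settings interact, a simplified sketch follows. It assumes the commonly described estimate: input size divided by hive.exec.reducers.bytes.per.reducer, rounded up and capped at hive.exec.reducers.max. The function name is hypothetical, and Hive on Tez may apply further adjustments that this sketch ignores.

```python
def estimate_reducers(input_bytes, bytes_per_reducer, max_reducers):
    """Simplified reducer-count estimate (an assumption, not Hive's exact code).

    reducers = min(max_reducers, ceil(input_bytes / bytes_per_reducer)),
    with a floor of 1.
    """
    # Integer ceiling division avoids floating-point rounding on large sizes.
    estimate = -(-input_bytes // bytes_per_reducer)
    return max(1, min(max_reducers, estimate))


MB = 1024 ** 2
GB = 1024 ** 3
input_size = 100 * GB  # hypothetical 100 GB of reducer input

for label, per_reducer in [("64 MB", 64 * MB), ("256 MB", 256 * MB), ("1 GB", GB)]:
    print(label, estimate_reducers(input_size, per_reducer, max_reducers=1009))
```

Under these assumptions, a smaller bytes-per-reducer value asks for more reducers until the cap in hive.exec.reducers.max takes over, which is why the two settings are usually tuned together.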

And then how does this all relate to YARN container sizes?

Any thoughts?

amcbarnett · ‎04-04-2016

Please take a look at these articles:

https://community.hortonworks.com/content/kbentry/22419/hive-on-tez-performance-tuning-determining-r... and

https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

aahaselgrove · ‎03-08-2017

I would be interested in why hive.exec.reducers.max defaults to 1009, and what influences an appropriate choice for this setting. I couldn't find any detail on it in either of the linked articles.

Cloudera Community

Guidance for setting hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max
