Is there any specific reason Ambari sets 64MB for "hive.exec.reducers.bytes.per.reducer"?

Hive's default value for "hive.exec.reducers.bytes.per.reducer" is 256 MB (it used to be 1 GB).

It looks like the 64 MB setting was introduced in AMBARI-8092 and AMBARI-8836.

I was wondering if there is any specific reason Ambari sets 64MB?

In general, is it a good thing to have many reducers?

1 ACCEPTED SOLUTION

Master Mentor

@Hajime

This makes sense

hive.exec.reducers.bytes.per.reducer

  • Default Value: 1,000,000,000 prior to Hive 0.14.0; 256 MB (256,000,000) in Hive 0.14.0 and later
  • Added In: Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

Size per reducer. Before Hive 0.14.0 the default is 1 GB, that is, if the input size is 10 GB then 10 reducers will be used. In Hive 0.14.0 and later the default is 256 MB, that is, if the input size is 1 GB then 4 reducers will be used.
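To make the arithmetic above concrete, here is a minimal Python sketch of the estimate, assuming the reducer count is simply the input size divided by the bytes-per-reducer setting, rounded up. The function name estimate_reducers is hypothetical, and the 64 MB figure is written as 64,000,000 bytes to match the decimal convention of the wiki text, which is an assumption about how Ambari stores it. Execution engines may adjust the number further, so treat this as an approximation.

```python
def estimate_reducers(input_bytes, bytes_per_reducer):
    """Rough reducer estimate: ceil(input / bytes_per_reducer), at least 1."""
    # -(-a // b) is integer ceiling division without floating point.
    return max(1, -(-input_bytes // bytes_per_reducer))


GB = 1_000_000_000  # decimal units, as in the wiki defaults

# Old default (1 GB per reducer) on 10 GB of input.
print(estimate_reducers(10 * GB, 1_000_000_000))

# Hive 0.14.0+ default (256 MB per reducer) on 1 GB of input.
print(estimate_reducers(1 * GB, 256_000_000))

# 64 MB per reducer (assumed to be 64,000,000 bytes) on 1 GB of input.
print(estimate_reducers(1 * GB, 64_000_000))
```

Under these assumptions, a smaller per-reducer size means more reducers for the same input, which is the trade-off the original question asks about.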

Point to note:

hive.exec.reducers.max should be set to a number that is less than the number of reduce slots available on the cluster. Hive calculates the number of reducers from hive.exec.reducers.bytes.per.reducer (default 1 GB before Hive 0.14.0, 256 MB in 0.14.0 and later). Consider setting hive.exec.reducers.bytes.per.reducer higher depending on the workload and the demand for reducers on the cluster.
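The interaction between the two settings can be sketched as follows, assuming hive.exec.reducers.max acts as an upper bound on the size-based estimate. The function name capped_reducers and the figure of 200 reduce slots are hypothetical, chosen only for illustration.

```python
def capped_reducers(input_bytes, bytes_per_reducer, max_reducers):
    """Size-based reducer estimate, bounded by an upper limit (simplified)."""
    # Ceiling division: how many reducers the input size alone would ask for.
    estimate = -(-input_bytes // bytes_per_reducer)
    # Never fewer than 1, never more than the configured maximum.
    return max(1, min(max_reducers, estimate))


GB = 1_000_000_000  # decimal units

# 100 GB of input at 64 MB per reducer would ask for well over 1,000
# reducers; a hypothetical cap of 200 (kept below the available reduce
# slots) limits what is actually requested.
print(capped_reducers(100 * GB, 64_000_000, 200))

# A small input stays under the cap, so the size-based estimate is used.
print(capped_reducers(1 * GB, 256_000_000, 200))
```

With a low bytes-per-reducer value, large inputs hit the cap quickly, so the two properties are best tuned together.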

https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
