
Is there any specific reason Ambari sets 64MB for "hive.exec.reducers.bytes.per.reducer"?

Hive's default value for "hive.exec.reducers.bytes.per.reducer" is 256MB (which used to be 1GB)

Looks like this was introduced in AMBARI-8092 and AMBARI-8836.

I was wondering if there is any specific reason Ambari sets 64MB?

In general, is it a good thing to have many reducers?

1 ACCEPTED SOLUTION

@Hajime

This makes sense

594-screen-shot-2015-11-26-at-53215-am.png

hive.exec.reducers.bytes.per.reducer

  • Default Value: 1,000,000,000 prior to Hive 0.14.0; 256 MB (256,000,000) in Hive 0.14.0 and later
  • Added In: Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

Size per reducer. Before Hive 0.14.0 the default is 1 GB, so an input of 10 GB uses 10 reducers. In Hive 0.14.0 and later the default is 256 MB, so an input of 1 GB uses 4 reducers.
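The arithmetic in that description can be reproduced with a few lines of Python. This is only a sketch of the documented rule (input size divided by size per reducer, rounded up), not Hive's actual code; the function name `reducers_for` is made up for illustration, and the 64 MB figure is assumed to mean 64 * 1024 * 1024 bytes.

```python
def reducers_for(input_bytes: int, bytes_per_reducer: int) -> int:
    """Estimate reducers as ceil(input_bytes / bytes_per_reducer), minimum 1.

    Hypothetical helper mirroring the rule described in the Hive wiki;
    it ignores any cap and any engine-specific adjustment.
    """
    estimate = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    return max(1, estimate)


# Examples from the wiki text (decimal sizes, as in the documented defaults).
print(reducers_for(10_000_000_000, 1_000_000_000))  # 10 GB at 1 GB each -> 10
print(reducers_for(1_000_000_000, 256_000_000))     # 1 GB at 256 MB each -> 4

# Same 1 GB input with a 64 MB setting (assumed to be 64 * 1024 * 1024 bytes).
print(reducers_for(1_000_000_000, 64 * 1024 * 1024))  # -> 15
```

So for the same input, a 64 MB setting asks for roughly four times as many reducers as the 256 MB default.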

Point to note:

hive.exec.reducers.max should be set to a number lower than the available reduce slots on the cluster. Hive calculates the number of reducers from hive.exec.reducers.bytes.per.reducer (default 1 GB before Hive 0.14.0, 256 MB in 0.14.0 and later). Consider setting this value higher, depending on the workloads and the demand for reducers on the cluster.
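To see how the cap interacts with the size per reducer, here is a rough Python sketch. It assumes the estimate is simply clamped between 1 and hive.exec.reducers.max; the input size (100 GB) and the cap (200) are hypothetical numbers chosen for illustration, and `capped_reducers` is not a Hive function. Real jobs may also be adjusted by the execution engine.

```python
def capped_reducers(input_bytes: int, bytes_per_reducer: int, max_reducers: int) -> int:
    """Ceiling-divide the input by the size per reducer, then clamp to [1, max_reducers]."""
    estimate = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    return max(1, min(max_reducers, estimate))


INPUT_BYTES = 100_000_000_000  # hypothetical 100 GB of input
MAX_REDUCERS = 200             # hypothetical cap, kept below the cluster's reduce slots

settings = [
    ("64 MB", 64 * 1024 * 1024),  # assumed byte value for the Ambari setting
    ("256 MB", 256_000_000),      # Hive 0.14.0+ default
    ("1 GB", 1_000_000_000),      # default before Hive 0.14.0
]

for label, size in settings:
    uncapped = capped_reducers(INPUT_BYTES, size, 10**9)  # effectively no cap
    capped = capped_reducers(INPUT_BYTES, size, MAX_REDUCERS)
    print(f"{label}: uncapped={uncapped}, capped={capped}")
```

With these numbers both the 64 MB and 256 MB settings run into the cap, while the 1 GB setting stays under it. That is the trade-off behind the advice above: a smaller size per reducer gives more parallelism only as long as the cluster has reducers to spare.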

https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties


