Created 11-26-2015 06:46 AM
Hive's default value for "hive.exec.reducers.bytes.per.reducer" is 256MB (which used to be 1GB)
Looks like this was introduced in AMBARI-8092 and AMBARI-8836.
I was wondering if there is any specific reason Ambari sets it to 64MB?
In general, is it a good thing to have many reducers?
Created on 11-26-2015 10:33 AM - edited 08-19-2019 05:45 AM
This makes sense
hive.exec.reducers.bytes.per.reducer
Default value: 1,000,000,000 prior to Hive 0.14.0; 256 MB (256,000,000) in Hive 0.14.0 and later
Size per reducer. The default prior to Hive 0.14.0 is 1 GB, that is, if the input size is 10 GB then 10 reducers will be used. In Hive 0.14.0 and later the default is 256 MB, that is, if the input size is 1 GB then 4 reducers will be used.
Point to note:
hive.exec.reducers.max should be set to a number that is less than the available reduce slots on the cluster. Hive calculates the number of reducers based on hive.exec.reducers.bytes.per.reducer (default 1GB prior to Hive 0.14.0, 256MB in 0.14.0 and later). Consider setting this high based on the workloads and the demand for reducers on the cluster.
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
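To make the arithmetic above concrete, here is a rough sketch of the estimate in Python. It is not Hive's actual code: the function name is made up for illustration, it assumes the estimate is simply the input size divided by hive.exec.reducers.bytes.per.reducer, rounded up and capped at hive.exec.reducers.max, and the cap of 1009 passed in below is just an example value.

```python
import math

def estimate_reducers(input_bytes, bytes_per_reducer, max_reducers):
    """Hypothetical helper: ceil(input / bytes per reducer), clamped to [1, max_reducers]."""
    wanted = math.ceil(input_bytes / bytes_per_reducer)
    return max(1, min(max_reducers, wanted))

GB = 1_000_000_000  # decimal GB, matching the 1,000,000,000 figure quoted from the wiki

# Old default (1 GB per reducer) with 10 GB of input
print(estimate_reducers(10 * GB, 1_000_000_000, 1009))  # 10

# Newer default (256 MB per reducer) with 1 GB of input
print(estimate_reducers(1 * GB, 256_000_000, 1009))  # 4

# The 64MB value from the question (taken here as 64 * 1024 * 1024 bytes) with 1 GB of input
print(estimate_reducers(1 * GB, 64 * 1024 * 1024, 1009))
```

Under these assumptions, a smaller bytes-per-reducer value means more reducers for the same input, which is why hive.exec.reducers.max matters as an upper bound.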