Created 12-01-2021 07:01 AM
I have seen that some of my jobs end up using only one reducer. Does that mean we get only one output file? Or does it depend?
Created 12-01-2021 08:41 AM
Hi,
The Reducer processes the output of the mappers: it takes the set of intermediate key-value pairs produced by the mappers as input, runs the reduce function on each key with its grouped values, and produces a new set of output records that is stored in HDFS. Each reducer writes its own part file, so a job that finishes with a single reducer normally produces a single output file (e.g., part-r-00000).
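For illustration, here is a minimal word-count-style reducer sketch using the Hadoop Java API (the class name SumReducer and the counting logic are assumptions for the example, not taken from your job); it shows the reduce function being run once per intermediate key with its grouped values:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Receives each intermediate key together with all values the mappers
// emitted for it, and writes one aggregated record to this reducer's output file.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();  // aggregate the grouped values for this key
        }
        context.write(key, new IntWritable(sum));  // goes to this reducer's part file
    }
}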
The number of mappers and reducers depends on the data being processed.
You can manually set the number of reducers with the property below, but I think it is generally not recommended.
set mapred.reduce.tasks=xx;
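Note that mapred.reduce.tasks is the old property name; in newer Hadoop releases the equivalent is mapreduce.job.reduces (the old name still works as a deprecated alias). In a plain MapReduce job you would set the same thing from the Java driver. A minimal sketch, with the job name and reducer count chosen only for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reducer-count-demo");
        // Equivalent of "set mapreduce.job.reduces=4" in Hive: each reducer
        // writes its own part-r-NNNNN file, so 4 reducers yield up to 4 output files.
        job.setNumReduceTasks(4);
        // ... configure mapper, reducer, and input/output paths, then submit:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}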
Regards,
Chethan YM