Created 12-01-2021 07:01 AM
I have seen that some of my jobs end up using only one reducer. Does that mean we get only one output file? Or does it depend?
Created 12-01-2021 08:41 AM
Hi,
The Reducer processes the output of the mappers: it takes the set of intermediate key-value pairs produced by the mappers as input, runs the reduce function on each key with its grouped values, and produces a new set of output records that is stored in HDFS. Each reducer writes its own part file, so a job that finishes with a single reducer normally produces a single output file (e.g., part-r-00000).
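For illustration, here is a minimal word-count-style reducer sketch using the Hadoop Java API (the class name SumReducer and the counting logic are assumptions for the example, not taken from your job); it shows the reduce function being run once per intermediate key with its grouped values:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Receives each intermediate key together with all values the mappers
// emitted for it, and writes one aggregated record to this reducer's output file.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();  // aggregate the grouped values for this key
        }
        context.write(key, new IntWritable(sum));  // goes to this reducer's part file
    }
}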
The number of mappers and reducers depends on the data being processed.
You can manually set the number of reducers with the property below, but I think it is generally not recommended.
set mapred.reduce.tasks=xx;
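Note that mapred.reduce.tasks is the old property name; in newer Hadoop releases the equivalent is mapreduce.job.reduces (the old name still works as a deprecated alias). In a plain MapReduce job you would set the same thing from the Java driver. A minimal sketch, with the job name and reducer count chosen only for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ReducerCountDemo {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "reducer-count-demo");
        // Equivalent of "set mapreduce.job.reduces=4" in Hive: each reducer
        // writes its own part-r-NNNNN file, so 4 reducers yield up to 4 output files.
        job.setNumReduceTasks(4);
        // ... configure mapper, reducer, and input/output paths, then submit:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}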
Regards,
Chethan YM