I've read conflicting advice about the correct value for the "Default Number of Reduce Tasks per Job" (mapreduce.job.reduces) parameter in YARN. What should it be set to?
I think the best answer to this question is the following, by Allen Wittenauer from LinkedIn:
At LinkedIn, I tend to tell users that their ideal reducers should be the optimal value that gets them closest to:
A multiple of the block size
A task time between 5 and 15 minutes
Creates the fewest files possible
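To make the three criteria above concrete, here is a rough sketch of how they might be combined into one estimate. It is not from the quoted answer: the function name, the inputs (total reduce output size, HDFS block size, per-reducer write throughput), and the assumption that output is spread evenly across reducers are all hypothetical simplifications, so treat the result as a starting point to measure against, not a rule.

```python
import math

def suggest_reducers(total_output_bytes, block_size_bytes, bytes_per_second,
                     min_task_s=5 * 60, max_task_s=15 * 60):
    """Hypothetical helper: estimate a reducer count from the three heuristics.

    Assumes reduce output is spread evenly and that bytes_per_second is a
    measured per-reducer throughput; both are simplifications.
    """
    # Largest whole number of blocks one reducer can write within max_task_s.
    # Bigger per-reducer output means fewer reducers, hence fewer files.
    blocks = int(max_task_s * bytes_per_second // block_size_bytes)
    if blocks < 1:
        return None  # not even one full block fits in the time window

    per_reducer_bytes = blocks * block_size_bytes  # a multiple of the block size
    if per_reducer_bytes / bytes_per_second < min_task_s:
        return None  # no block multiple lands inside the 5-15 minute window

    return max(1, math.ceil(total_output_bytes / per_reducer_bytes))


MiB = 1024 * 1024
# Illustrative numbers only: 1 TiB of output, 128 MiB blocks, 1 MiB/s per reducer.
reducers = suggest_reducers(1024 * 1024 * MiB, 128 * MiB, 1 * MiB)
print(reducers)
```

With these illustrative inputs each reducer would write 7 blocks (896 MiB) in roughly 15 minutes. For jobs that parse generic options, the resulting number can then be passed as -D mapreduce.job.reduces=<n>.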