06-11-2015 02:08 PM
I've read conflicting advice about the correct value for the "Default Number of Reduce Tasks per Job" (mapreduce.job.reduces) parameter in YARN. What is the recommended setting?
06-11-2015 03:06 PM
I think the best answer to this question is the following, by Allen Wittenauer from LinkedIn:
At LinkedIn, I tend to tell users that their ideal reducers should be the optimal value that gets them closest to:
- A multiple of the block size
- A task time between 5 and 15 minutes
- Creates the fewest files possible
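To make those three criteria a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not from Allen's answer: the helper names (`suggest_reducers`, `estimate_task_minutes`), the 128 MiB block size, and the per-reducer throughput figure are all assumptions for illustration, so substitute numbers measured on your own cluster.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def suggest_reducers(total_output_bytes, block_size_bytes=128 * MIB,
                     blocks_per_reducer=1):
    """Hypothetical helper: pick a reducer count so that each reducer
    writes roughly `blocks_per_reducer` full blocks (a multiple of the
    block size), which also keeps the number of output files low."""
    if total_output_bytes <= 0:
        return 1
    target_bytes = block_size_bytes * blocks_per_reducer
    return max(1, math.ceil(total_output_bytes / target_bytes))


def estimate_task_minutes(total_output_bytes, reducers, bytes_per_second):
    """Hypothetical helper: crude task-time estimate, assuming data is
    spread evenly and each reducer sustains `bytes_per_second`."""
    per_reducer_bytes = total_output_bytes / reducers
    return per_reducer_bytes / bytes_per_second / 60


# Assumed workload: 50 GiB of reducer output, 128 MiB blocks,
# 4 blocks per reducer, 1 MiB/s effective throughput per reducer.
total = 50 * GIB
reducers = suggest_reducers(total, 128 * MIB, blocks_per_reducer=4)
minutes = estimate_task_minutes(total, reducers, bytes_per_second=1 * MIB)
print(reducers, round(minutes, 1))
```

If the estimated task time falls outside the 5 to 15 minute window, adjust `blocks_per_reducer` up or down and recompute, then use the resulting count as the value of mapreduce.job.reduces for that job. Treat the result as a starting point to verify against real task timings, not as a definitive setting.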