What is the order of execution for a MapReduce job? Is the following order correct? Please correct me if I am wrong.
1. Mapper
2. Partition each mapper output
3. Sorting within each partition based on key
4. Grouping
5. Shuffle and merge: each reducer will take one partition from all map tasks and merge them together
6. Combiner
7. Reducer
a) Could you please provide the source for this link? It is really useful.
b) What about these two in the diagram, and where do they come in the order?
a> This slide is from a Hortonworks training course. The course and slides are available to paid customers only.
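i> There is no separate grouping step. Values with the same key end up together as a result of the sort and merge on the reducer side.
ii> Shuffle happens when data moves from the map side to the reduce side (please see the diagram), and merge happens during the sort phase on the reducer side.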
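
To make the order easier to see, here is a minimal sketch in plain Python that simulates a word count. It is not Hadoop API code: the function names, `NUM_REDUCERS`, and the toy partition function are all made up for illustration. It also assumes the combiner runs on the map side, after the sort within each partition and before the shuffle, which is where it is normally applied.

```python
from collections import defaultdict
from itertools import groupby
import heapq

NUM_REDUCERS = 2  # hypothetical number of reduce tasks


def map_phase(line):
    # Mapper: emit (word, 1) for each word in the input line.
    return [(word, 1) for word in line.split()]


def partition(key, num_reducers=NUM_REDUCERS):
    # Partitioner: pick the reducer for a key (toy stand-in for a hash partitioner).
    return sum(key.encode()) % num_reducers


def sum_by_key(sorted_pairs):
    # Sum the values of adjacent pairs that share a key (input must be sorted by key).
    return [(k, sum(v for _, v in grp))
            for k, grp in groupby(sorted_pairs, key=lambda kv: kv[0])]


def run_map_task(lines):
    # Map side: map -> partition -> sort within each partition -> combine.
    partitions = defaultdict(list)
    for line in lines:
        for key, value in map_phase(line):
            partitions[partition(key)].append((key, value))
    return {p: sum_by_key(sorted(pairs)) for p, pairs in partitions.items()}


def run_reduce_task(reducer_id, map_outputs):
    # Shuffle: fetch this reducer's partition from every map task.
    fetched = [out.get(reducer_id, []) for out in map_outputs]
    # Merge: combine the already sorted runs into one sorted stream.
    merged = heapq.merge(*fetched)
    # Reduce: one result per key, computed from all of that key's values.
    return sum_by_key(merged)


def run_job(splits):
    # Each split is handled by one simulated map task.
    map_outputs = [run_map_task(split) for split in splits]
    result = []
    for reducer_id in range(NUM_REDUCERS):
        result.extend(run_reduce_task(reducer_id, map_outputs))
    return dict(result)
```

Calling `run_job([["a b a"], ["b c"]])` runs two simulated map tasks and two reduce tasks. Note that there is no explicit grouping function in the sketch: equal keys become adjacent only because each map output is sorted and the reducer merges those sorted runs.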