Support Questions

vamsi123 · ‎12-30-2016

What is the order of execution for a MapReduce job? Is the order below correct? Please correct me if I am wrong.

Mapper
partition each mapper output
sorting within each partition based on key
grouping
shuffle and merge: each reducer takes one partition from all map tasks and merges them together
combiner
reducer

manish1 · ‎12-30-2016

@vamsi valiveti

This is the right order:

vamsi123 · ‎12-31-2016

a) Could you please provide the source for this diagram? It is really useful.

b) What about these two steps? Where do they fit in the diagram?

grouping
shuffle and merge: each reducer takes one partition from all map tasks and merges them together

manish1 · ‎12-31-2016

@vamsi valiveti

a> This slide is from a Hortonworks training course. The course and its slides are available to paid customers only.

b>

i> There is nothing like grouping.

ii> Shuffle happens when data moves from map to reduce (please see the diagram), and merge happens during the sort phase on the reducer side.
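
To make the flow concrete, here is a toy single-process sketch in Python. It is not Hadoop code and is not taken from the slide; it only mimics the order as I understand it: map, partition, sort within each partition, combine, shuffle, merge, reduce. The word-count job, the function names (map_fn, partition, combine, run_map_task, run_reduce_task, run_job) and the simple byte-sum partitioner are all assumptions made up for illustration.

```python
import heapq
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # hypothetical setting for this toy example


def map_fn(line):
    """Map: emit (word, 1) for every word in the input line."""
    for word in line.split():
        yield word, 1


def partition(key, num_reducers):
    """Partition: choose a reducer for the key (toy hash, not Hadoop's partitioner)."""
    return sum(key.encode()) % num_reducers


def combine(sorted_pairs):
    """Combine: pre-aggregate sorted map output locally, before the shuffle."""
    return [(key, sum(v for _, v in group))
            for key, group in groupby(sorted_pairs, key=itemgetter(0))]


def run_map_task(lines, num_reducers):
    """Map side: map -> partition -> sort within each partition -> combine."""
    partitions = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            partitions[partition(key, num_reducers)].append((key, value))
    return {p: combine(sorted(pairs)) for p, pairs in partitions.items()}


def run_reduce_task(p, map_outputs):
    """Reduce side: shuffle -> merge -> reduce for partition p."""
    # Shuffle: fetch partition p from the output of every map task.
    fetched = [output.get(p, []) for output in map_outputs]
    # Merge: combine the already-sorted runs into one sorted stream.
    merged = heapq.merge(*fetched)
    # Reduce: called once per key with all of that key's values.
    return {key: sum(v for _, v in group)
            for key, group in groupby(merged, key=itemgetter(0))}


def run_job(splits, num_reducers=NUM_REDUCERS):
    """Run one map task per input split, then one reduce task per partition."""
    map_outputs = [run_map_task(split, num_reducers) for split in splits]
    result = {}
    for p in range(num_reducers):
        result.update(run_reduce_task(p, map_outputs))
    return result
```

Calling run_job([["a b a", "c"], ["b c a"]]) runs two map tasks and two reduce tasks and returns the word counts as a dict. Note that the combiner here runs on the map side, on sorted output and before the shuffle, and that the reduce step sees each key once with all its values, which is about as close as this sketch gets to a "grouping" step.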

Cloudera Community

Map reduce flow clarification