Map reduce flow clarification


Contributor

What is the order of execution for a MapReduce job? Is the order below correct? Please correct me if I am wrong.

Mapper
partition each mapper output
sorting within each partition based on key
grouping
shuffle and merge: each reducer will take one partition from all map tasks and merge them together
combiner
reducer

Accepted Solutions

Re: Map reduce flow clarification

Expert Contributor

@vamsi valiveti

This is the right order:

[Image: 10950-screen-shot-2016-12-30-at-11152-pm.png]

Re: Map reduce flow clarification

Contributor

a) Could you please share the source of this diagram? It is really useful.

b) What about these two steps? Where do they fit in the diagram?

  1. grouping
  2. shuffle and merge: each reducer will take one partition from all map tasks and merge them together

Re: Map reduce flow clarification

Expert Contributor

@vamsi valiveti

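a> This slide is from a Hortonworks training course. The course and its slides are available to paid customers only.

b>

i> There is no separate phase called grouping. Values for the same key are brought together as part of the sort and merge on the reducer side.

ii> Shuffle happens when data moves from the map side to the reduce side (please see the diagram), and merge happens during the sort phase on the reducer side.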
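
For anyone who finds the order easier to follow in code, here is a minimal single-process sketch of the flow discussed above: map, partition, sort within each partition, combine on the map side, shuffle, merge, reduce. It is only a toy word count in plain Python, not Hadoop code. The function names (`map_fn`, `partition`, `combine`, `run_map_task`, `run_reduce_task`, `word_count`) and the byte-sum partitioner are assumptions made up for this illustration, and real Hadoop may run the combiner zero or more times rather than exactly once.

```python
import heapq
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # hypothetical setting for this sketch


def map_fn(line):
    """Word-count mapper: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]


def partition(key, num_reducers):
    """Deterministic stand-in for a hash partitioner (not Hadoop's own)."""
    return sum(key.encode("utf-8")) % num_reducers


def combine(sorted_pairs):
    """Optional map-side combiner: pre-aggregate counts per key."""
    return [(key, sum(v for _, v in group))
            for key, group in groupby(sorted_pairs, key=itemgetter(0))]


def run_map_task(lines, num_reducers):
    """Map, partition, sort within each partition, then combine."""
    partitions = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            partitions[partition(key, num_reducers)].append((key, value))
    return {p: combine(sorted(pairs)) for p, pairs in partitions.items()}


def run_reduce_task(partition_id, map_outputs):
    """Shuffle, merge, and reduce for one reducer."""
    # Shuffle: fetch this reducer's partition from every map task.
    runs = [out.get(partition_id, []) for out in map_outputs]
    # Merge: combine the already-sorted runs into one sorted stream.
    merged = heapq.merge(*runs)
    # Reduce: equal keys are now adjacent, so sum the values per key.
    return [(key, sum(v for _, v in group))
            for key, group in groupby(merged, key=itemgetter(0))]


def word_count(splits, num_reducers=NUM_REDUCERS):
    """Run one map task per input split, then one reduce task per partition."""
    map_outputs = [run_map_task(split, num_reducers) for split in splits]
    result = {}
    for p in range(num_reducers):
        result.update(run_reduce_task(p, map_outputs))
    return result
```

Calling `word_count([["a b a"], ["b c", "a"]])` treats each inner list as one input split handled by its own map task, and gives a count of 3 for `a`, 2 for `b`, and 1 for `c`. Note that the keys only end up next to each other in `run_reduce_task` because the merged stream is sorted, which is why grouping does not show up as a phase of its own.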
