Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How does the AM decide how many MapReduce jobs will be created for a particular execution, and in which memory will the map and reduce tasks run?

Expert Contributor

I want to understand the internals of YARN and MapReduce. I am new to Hadoop and am not getting how exactly the jobs get executed.

1 ACCEPTED SOLUTION

Hi @heta desai

The Application Master launches one map task for each input split. A file that is no larger than the HDFS block size typically produces a single split, so for small inputs there is one split per input file. If the input file is bigger than the HDFS block size, it is divided into two or more splits, and each split gets its own map task.
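To make the split arithmetic concrete, here is a minimal sketch in Python. It is not Hadoop code: the function names are hypothetical, and it only imitates the default behaviour of `FileInputFormat` as I understand it (split size derived from the block size, with the last split allowed to be up to 10% larger). Other input formats, compressed non-splittable files, and empty files are not modelled.

```python
MB = 1024 * 1024

# Assumption: mirrors the slack factor used by the default FileInputFormat,
# which lets the final split be up to 10% larger than the split size.
SPLIT_SLOP = 1.1


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size, clamped between the min and max settings."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, block_size, min_size=1, max_size=2**63 - 1):
    """Estimate how many input splits (and so map tasks) one file produces."""
    if file_length <= 0:
        raise ValueError("this sketch only models non-empty files")
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_length
    splits = 0
    # Carve off full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining != 0:
        splits += 1
    return splits


def count_map_tasks(file_lengths, block_size):
    """One map task per split, summed over all input files of the job."""
    return sum(count_splits(length, block_size) for length in file_lengths)


if __name__ == "__main__":
    block = 128 * MB
    for size_mb in (100, 140, 300):
        print(size_mb, "MB ->", count_splits(size_mb * MB, block), "split(s)")
```

With a 128 MB block size, a 100 MB file stays in one split, while a 300 MB file is divided into three, so a job reading both would start four map tasks.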

Also, the memory used for map and reduce tasks is the RAM of the NodeManagers: each task runs in a YARN container on a NodeManager host.
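As a rough illustration of how NodeManager RAM limits the number of tasks running at once, the sketch below estimates how many task containers fit on one node. The function names and the example numbers are hypothetical. The inputs correspond to the settings `yarn.nodemanager.resource.memory-mb`, `yarn.scheduler.minimum-allocation-mb`, `mapreduce.map.memory.mb` and `mapreduce.reduce.memory.mb`. Rounding each request up to a multiple of the minimum allocation is an assumption about scheduler behaviour and can differ between schedulers and versions.

```python
import math


def normalize_request(request_mb, min_allocation_mb):
    """Round a container request up to a multiple of the minimum allocation.

    Assumption: the scheduler grants memory in multiples of the minimum
    allocation; real schedulers may use a different increment.
    """
    multiples = math.ceil(request_mb / min_allocation_mb)
    return max(1, multiples) * min_allocation_mb


def containers_per_node(node_memory_mb, request_mb, min_allocation_mb):
    """How many containers of one size fit in the RAM a NodeManager offers."""
    granted = normalize_request(request_mb, min_allocation_mb)
    return node_memory_mb // granted


if __name__ == "__main__":
    # Hypothetical example values, not recommendations.
    node_mb = 8192       # yarn.nodemanager.resource.memory-mb
    min_alloc_mb = 1024  # yarn.scheduler.minimum-allocation-mb
    map_mb = 1536        # mapreduce.map.memory.mb
    reduce_mb = 3072     # mapreduce.reduce.memory.mb
    print("map containers per node:",
          containers_per_node(node_mb, map_mb, min_alloc_mb))
    print("reduce containers per node:",
          containers_per_node(node_mb, reduce_mb, min_alloc_mb))
```

The estimate ignores the container used by the Application Master itself and any CPU (vcore) limits, so treat it as an upper bound.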

Please refer to the following for more details:

http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.html

3 REPLIES

Expert Contributor

Thank you.

Can you please accept my answer if it answered your question? 🙂