Support Questions

How does the AM decide how many MapReduce jobs will be created for a particular execution, and in which memory will the map and reduce tasks run?

Expert Contributor

I want to understand the internals of YARN and MapReduce. I am new to Hadoop, and I am not getting how exactly the jobs get executed.

1 ACCEPTED SOLUTION

Hi @heta desai

The Application Master launches one map task for each input split. Typically, there is one split per input file. If an input file is too big (bigger than the HDFS block size), then two or more splits are associated with that file.
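To make the split counting concrete, here is a minimal Python sketch of how I understand `FileInputFormat` to size splits: the split size is `max(minSize, min(maxSize, blockSize))`, and a trailing piece of up to 10% of a split is folded into the last split. The function names are hypothetical, the 128 MB block size is an assumed default, and the model ignores non-splittable (for example, gzip-compressed) files and other input formats.

```python
# Simplified model of split counting; not the actual Hadoop code.
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size clamped between min_size and max_size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, block_size=128 * MB, min_size=1, max_size=2**63 - 1):
    """Number of input splits (and so map tasks) for one splittable file."""
    if file_length == 0:
        return 1  # an empty file still produces one (empty) split
    split_size = compute_split_size(block_size, min_size, max_size)
    splits = 0
    remaining = file_length
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining != 0:
        splits += 1  # whatever is left becomes the final split
    return splits


def count_map_tasks(file_lengths, block_size=128 * MB):
    """One map task per split, summed over all input files."""
    return sum(count_splits(n, block_size) for n in file_lengths)


if __name__ == "__main__":
    # Three hypothetical input files: 10 MB, 300 MB and 130 MB.
    print(count_map_tasks([10 * MB, 300 * MB, 130 * MB]))
```

Under these assumptions the 10 MB file gets one split, the 300 MB file gets three, and the 130 MB file gets only one, because its extra 2 MB falls within the 10% slop.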

Also, the memory used for map and reduce tasks is the RAM of the NodeManagers. Each task runs in a YARN container on a NodeManager host.
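As a rough illustration of how that RAM is shared, the sketch below estimates how many task containers fit on one NodeManager. It assumes the memory a NodeManager offers comes from `yarn.nodemanager.resource.memory-mb`, the per-task request comes from `mapreduce.map.memory.mb` or `mapreduce.reduce.memory.mb`, and the scheduler rounds each request up to a multiple of `yarn.scheduler.minimum-allocation-mb`. The function names are hypothetical, the rounding rule is a simplification that can differ between schedulers, and vcores are ignored.

```python
import math


def normalize_request(requested_mb, min_allocation_mb=1024):
    """Round a container request up to a multiple of the minimum allocation."""
    multiples = math.ceil(requested_mb / min_allocation_mb)
    return max(min_allocation_mb, multiples * min_allocation_mb)


def containers_per_node(node_memory_mb, task_memory_mb, min_allocation_mb=1024):
    """How many task containers of this size fit in one NodeManager's memory."""
    return node_memory_mb // normalize_request(task_memory_mb, min_allocation_mb)


if __name__ == "__main__":
    # Hypothetical node offering 8192 MB, with map tasks requesting 1536 MB each.
    print(normalize_request(1536))
    print(containers_per_node(8192, 1536))
```

With these example numbers a 1536 MB request is rounded up to 2048 MB, so four such containers fit on a node that offers 8192 MB.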

Please refer to the following page for more details:

http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.html

3 REPLIES

Expert Contributor

Thank you.

Can you please accept my answer if it answered your question? 🙂