How to change / configure the number of mappers?

Rising Star

How can I change / configure the number of mappers?

Expert Contributor

Hi @Dukool SHarma

The number of map tasks for a job is driven by the number of input splits: one map task runs per split. A split is a logical division of the input data that a MapReduce program uses during processing, so it does not have to match a physical HDFS block.

Suppose you have a 200 MB file and the HDFS block size is the default 128 MB. The file is then treated as two splits.
But if you specify a split size of, say, 200 MB in your MapReduce program, both blocks are treated as a single split and one mapper is assigned to the job.
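The arithmetic above can be sketched in plain Java. This is only an illustration, not Hadoop's own code: the class and method names are made up, and the 1.1 tolerance is an assumption modelled on the 10% slop that FileInputFormat allows on the last split.

```java
// Illustrative sketch: counts the input splits for a single file.
// SplitCounter and countSplits are hypothetical names, not part of Hadoop.
final class SplitCounter {
    // Assumption: the last split may be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;
    private static final long MB = 1024L * 1024L;

    static int countSplits(long fileSize, long splitSize) {
        if (fileSize <= 0 || splitSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int splits = 0;
        long remaining = fileSize;
        // Carve off full-size splits while clearly more than one split remains.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining > 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(countSplits(200 * MB, 128 * MB)); // prints 2
        System.out.println(countSplits(200 * MB, 200 * MB)); // prints 1
    }
}
```

One mapper is started per split, so the two calls correspond to two mappers and one mapper respectively.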

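If you want n mappers, set the split size to roughly the file size divided by n, using these parameters:
conf.set("mapred.max.split.size", "41943040"); // maximum split size in bytes (40 MB)
conf.set("mapred.min.split.size", "20971520"); // minimum split size in bytes (20 MB)

In Hadoop 2 and later these names are deprecated aliases for mapreduce.input.fileinputformat.split.maxsize and mapreduce.input.fileinputformat.split.minsize.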
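As a rough sketch of the "divide the file size by n" rule, the helper below computes the split size for a desired number of mappers. The class and method names are hypothetical, and it only prints the value you would pass to conf.set. It also assumes the target split size is no larger than the block size; for a larger target, the minimum split size would have to be raised instead.

```java
// Illustrative sketch: derive a split size from a desired mapper count.
// SplitSizePlanner and splitSizeFor are hypothetical names, not part of Hadoop.
final class SplitSizePlanner {
    static long splitSizeFor(long fileSize, int mappers) {
        if (fileSize <= 0 || mappers <= 0) {
            throw new IllegalArgumentException("fileSize and mappers must be positive");
        }
        // Ceiling division, so that n splits always cover the whole file.
        return (fileSize + mappers - 1) / mappers;
    }

    public static void main(String[] args) {
        long fileSize = 200L * 1024 * 1024; // a 200 MB input file
        long splitSize = splitSizeFor(fileSize, 5);
        // The value to use as the maximum split size for 5 mappers.
        System.out.println("mapred.max.split.size=" + splitSize); // prints mapred.max.split.size=41943040
    }
}
```

For a 200 MB file and 5 mappers this gives 41943040 bytes (40 MB), the same maximum split size used in the parameters above.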

Please accept my answer if it is found helpful.

Rising Star

The number of mappers always equals the number of input splits. You can control the number of splits by changing mapred.min.split.size, which sets the minimum input split size.

Assume the block size is 64 MB and mapred.min.split.size is set to 128 MB.
The InputSplit size will then be 128 MB, even though the block size is 64 MB.
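The reason the minimum wins here is the way the split size is derived from the three values: it is the block size, capped by the maximum split size, then raised to the minimum split size. A minimal sketch of that rule follows; the class name is made up, and it assumes the maximum split size is left effectively unlimited.

```java
// Illustrative sketch of the rule: splitSize = max(minSize, min(maxSize, blockSize)).
// SplitSizeRule is a hypothetical name, not part of Hadoop.
final class SplitSizeRule {
    private static final long MB = 1024L * 1024L;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 64 * MB;
        long minSize = 128 * MB;        // mapred.min.split.size
        long maxSize = Long.MAX_VALUE;  // assumption: no maximum configured
        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        System.out.println(splitSize / MB + " MB"); // prints 128 MB
    }
}
```

The same rule explains the opposite case: a maximum split size below the block size shrinks the splits and so increases the number of mappers.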

Master Mentor

@Dukool SHarma

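Yes, you can. When launching the job from the command line, pass the mapreduce.job.maps property with -D, as shown below. Generic options such as -D go after the jar (and the main class, if you name one), and they are only parsed if the driver runs through ToolRunner / GenericOptionsParser. Note that mapreduce.job.maps is only a hint to the framework: the actual number of map tasks still follows the number of input splits.

bin/hadoop jar yourapp.jar -Dmapreduce.job.maps=5 ...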

HTH

Master Mentor

@Dukool SHarma

Any updates?