How to change/configure the number of mappers?

How can I change/configure the number of mappers?

4 REPLIES

Cloudera Employee

Hi @Dukool SHarma

The number of map tasks for a given job is driven by the number of input splits, so the number of map tasks is equal to the number of input splits. A split is a logical division of the input data, used when the data is processed by a MapReduce program (unlike an HDFS block, which is a physical division of the stored data).

Suppose you have a 200 MB file and the HDFS block size is the default 128 MB. The file is stored in two blocks, so by default it will be processed as two splits.
But if you specify a larger split size (say 200 MB) in your MapReduce program, both blocks will be treated as a single split and only one mapper will be assigned to the job.

If you want roughly n mappers, set the maximum split size to the file size divided by n, using these parameters:

conf.set("mapred.max.split.size", "41943040"); // maximum split size in bytes (40 MB)

conf.set("mapred.min.split.size", "20971520"); // minimum split size in bytes (20 MB)
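
For context, here is a minimal driver sketch that applies these settings before submitting the job; the class name, identity mapper, and command-line paths are placeholders I am assuming, not part of the original answer. The mapred.* names still work but are deprecated; on Hadoop 2+ the equivalent properties are mapreduce.input.fileinputformat.split.maxsize and mapreduce.input.fileinputformat.split.minsize.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SplitSizeDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Deprecated property names used in the reply above:
        conf.set("mapred.max.split.size", "41943040");  // 40 MB
        conf.set("mapred.min.split.size", "20971520");  // 20 MB

        // Hadoop 2+ equivalents:
        conf.set("mapreduce.input.fileinputformat.split.maxsize", "41943040");
        conf.set("mapreduce.input.fileinputformat.split.minsize", "20971520");

        Job job = Job.getInstance(conf, "split-size-demo");
        job.setJarByClass(SplitSizeDriver.class);
        job.setMapperClass(Mapper.class);           // identity mapper, just passes records through
        job.setNumReduceTasks(0);                   // map-only job, so the mapper count is easy to observe
        job.setOutputKeyClass(LongWritable.class);  // TextInputFormat keys (byte offsets)
        job.setOutputValueClass(Text.class);        // TextInputFormat values (lines)

        FileInputFormat.addInputPath(job, new Path(args[0]));    // input path from the command line
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output path from the command line
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}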

Please accept my answer if it is found helpful.

The number of mappers always equals the number of input splits. We can control the number of splits by changing mapred.min.split.size, which controls the minimum input split size.

Assume the block size is 64 MB and mapred.min.split.size is set to 128 MB.
The size of InputSplit will be 128 MB even though the block size is 64 MB.
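
This follows from the split-size rule used by FileInputFormat, which is max(minSize, min(maxSize, blockSize)). A small sketch with the numbers above (the class is just an illustration, not Hadoop code):

public class SplitSizeCheck {
    // Same rule FileInputFormat applies when computing split sizes:
    // max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // 64 MB HDFS block
        long minSize   = 128L * 1024 * 1024;  // mapred.min.split.size set to 128 MB
        long maxSize   = Long.MAX_VALUE;      // maximum split size left at its default
        // Prints 134217728 (128 MB): each split spans two blocks, so fewer mappers run.
        System.out.println(computeSplitSize(blockSize, minSize, maxSize));
    }
}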

Mentor

@Dukool SHarma

Yes, you can set it when launching the job from the command line with a -D generic option, as shown below. The -D options go after the jar (and the main class, if the jar's manifest does not name one), and they only take effect if the driver parses generic options, e.g. via ToolRunner. Also note that mapreduce.job.maps is just a hint; for file-based input formats the actual number of mappers is still determined by the input splits.

bin/hadoop jar yourapp.jar -Dmapreduce.job.maps=5 ...

HTH
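
For reference, a minimal ToolRunner-based driver sketch is below; the class name YourApp, the identity mapper, and the argument handling are assumptions for illustration, not from this thread. Going through ToolRunner is what makes -D generic options on the command line take effect.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class YourApp extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any -D options parsed from the command line.
        Job job = Job.getInstance(getConf(), "mapper-count-demo");
        job.setJarByClass(YourApp.class);
        job.setMapperClass(Mapper.class);  // identity mapper
        job.setNumReduceTasks(0);          // map-only job
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner strips generic options (-D, -files, -libjars, ...) before calling run().
        System.exit(ToolRunner.run(new Configuration(), new YourApp(), args));
    }
}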

Mentor

@Dukool SHarma

Any updates?
