Hi @Dukool SHarma
The number of map tasks for a given job is driven by the number of input splits: one map task is launched per split. A split is a logical division of the input data, used by the MapReduce framework during processing.
Suppose you have a 200 MB file and the HDFS default block size is 128 MB. The file is stored as two blocks, so by default it yields two splits (and therefore two map tasks).
But if you have specified a larger split size (say 200 MB) in your MapReduce program, both blocks are treated as a single split, and only one mapper is assigned to the job.
If you want n map tasks, set the split size to the file size divided by n:
conf.set("mapred.max.split.size", "41943040"); // maximum split size in bytes (40 MB)
conf.set("mapred.min.split.size", "20971520"); // minimum split size in bytes (20 MB)
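To make the arithmetic concrete, here is a small plain-Java sketch (not Hadoop API; the class and method names `SplitSizeCalc` and `splitSizeFor` are hypothetical) that computes the split size needed to get roughly n mappers over a file:

```java
public class SplitSizeCalc {
    // Split size in bytes so that a file of fileSizeBytes yields
    // about n input splits, i.e. about n map tasks.
    // Ceiling division so the last partial split is not lost.
    static long splitSizeFor(long fileSizeBytes, int n) {
        return (fileSizeBytes + n - 1) / n;
    }

    public static void main(String[] args) {
        long fileSize = 200L * 1024 * 1024; // 200 MB file
        // For 5 mappers: 41943040 bytes (40 MB), matching the
        // mapred.max.split.size value shown above.
        System.out.println(splitSizeFor(fileSize, 5));
    }
}
```

Pass the resulting value (as a string) to `conf.set("mapred.max.split.size", ...)` as shown above.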
Please accept my answer if you find it helpful.