09-20-2015 02:55 AM
How can I determine the number of mappers without knowing the block size? We have only 2 input files, but we don't know their exact sizes.
Can anyone tell me whether this can be determined?
09-24-2015 05:50 AM
The number of mappers you can usefully run depends on how many disks your data is spread across, so IMO it's more a question of how big your cluster is. The block size and file size determine how many blocks there are (and the replication factor determines how many copies of each block exist), but what you should really be asking yourself is how many tasks can concurrently access those blocks.
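To make the block arithmetic concrete, here is a rough sketch of how the block count follows from file size and block size. It assumes splittable input files, a split size equal to the HDFS block size, and one map task per split; the real count can differ depending on the input format and split settings. The function name, the 128 MB block size, and the two file sizes are made up for illustration, since the actual sizes aren't known here.

```python
MB = 1024 * 1024


def estimate_map_tasks(file_sizes_bytes, split_size_bytes):
    """Estimate map tasks as the total number of splits across all files.

    Assumption: each file is splittable and is cut into chunks of
    split_size_bytes, with one map task per chunk.
    """
    if split_size_bytes <= 0:
        raise ValueError("split size must be positive")
    total = 0
    for size in file_sizes_bytes:
        # Integer ceiling division: a partial last chunk still needs a task.
        splits = (size + split_size_bytes - 1) // split_size_bytes
        # Assumption: an empty file is still counted as one split.
        total += max(1, splits)
    return total


# Hypothetical example: two files of 300 MB and 50 MB, 128 MB blocks.
# 300 MB needs 3 splits, 50 MB needs 1.
example = estimate_map_tasks([300 * MB, 50 * MB], 128 * MB)
print(example)
```

So the estimate needs both the file sizes and the block size; without them you can only say that two non-empty files give at least two splits under these assumptions.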