How to determine number of mapper w/o knowing the ...
The number of mappers should depend on how many disks your data is spread across, so IMO it's more a question of how big your cluster is. The file size and block size determine how many blocks there are (the replication factor only determines how many copies of each block exist), but what you should really be asking yourself is how many tasks can access those blocks concurrently.
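To make that concrete, here is a rough back-of-the-envelope sketch in Python. It is not Hadoop code: the function name `estimate_map_tasks` and the parameters `nodes` and `map_slots_per_node` are made up for illustration, and it assumes one map task per block, which is only approximately what the default file input formats do (actual split sizes depend on the input format and job configuration).

```python
def estimate_map_tasks(file_size_bytes, block_size_bytes, replication,
                       nodes, map_slots_per_node):
    """Rough estimate only; assumes one map task per HDFS block."""
    # Number of logical blocks: ceiling of file size / block size.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # Replication multiplies stored copies, not the number of logical blocks.
    stored_replicas = blocks * replication
    # Hypothetical cluster capacity: how many map tasks can run at once.
    capacity = nodes * map_slots_per_node
    concurrent = min(blocks, capacity)
    # Number of "waves" needed to work through all blocks.
    waves = (blocks + concurrent - 1) // concurrent if concurrent else 0
    return {
        "blocks": blocks,
        "stored_replicas": stored_replicas,
        "concurrent_tasks": concurrent,
        "waves": waves,
    }


if __name__ == "__main__":
    GiB = 1024 ** 3
    MiB = 1024 ** 2
    # 10 GiB file, 128 MiB blocks, replication 3, 4 nodes with 8 map slots each.
    print(estimate_map_tasks(10 * GiB, 128 * MiB, 3, 4, 8))
```

With these example numbers the file has 80 blocks, but only 32 tasks can run at once, so the job takes about three waves. Adding more mappers than the cluster can run concurrently would not speed it up, which is the point above.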