Created 05-02-2016 01:36 PM
Created 05-02-2016 01:44 PM
If I am not wrong, it's not very simple to answer this question. The number of mappers depends on your compute power [CPU and memory] and also on the number of containers [when using YARN].
Usually, 1 JVM corresponds to 1 mapper.
Depending upon your compute, you need to configure the MR memory settings so that you can use the maximum resources [i.e., mappers and reducers]; a code sketch of setting these properties follows the link below.
Please refer to the link below:
http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/
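As an illustration only, here is a minimal Java sketch of setting those MR memory properties on a job. The values (2048 MB maps, 4096 MB reduces, heap at roughly 80% of the container size) are assumptions for the example, not recommendations; derive them from your own node sizing as described in the link above.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryConfigExample {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Assumed example values: size containers to fit your nodes' YARN memory
        conf.setInt("mapreduce.map.memory.mb", 2048);     // memory per map container
        conf.setInt("mapreduce.reduce.memory.mb", 4096);  // memory per reduce container
        // JVM heap is usually set to ~80% of the container size
        conf.set("mapreduce.map.java.opts", "-Xmx1638m");
        conf.set("mapreduce.reduce.java.opts", "-Xmx3276m");
        Job job = Job.getInstance(conf, "memory-config-example");
        // ...set mapper, reducer, input and output as usual, then job.waitForCompletion(true)
    }
}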
Created 05-02-2016 01:51 PM
I don't think there is any max number; most likely some theoretical int limit that you wouldn't reach in any cluster. Or do you mean at the same time? In that case it would be:
(YARN memory per node * number of nodes) / yarn.scheduler.minimum-allocation-mb
So if you have 100 nodes, your nodes have 96 GB of YARN memory, and your minimum allocation size (and your map size) is 2 GB, it would be 4800.
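To make that arithmetic concrete, here is a small stand-alone Java sketch (not a Hadoop API, just the formula above); the node count, per-node YARN memory, and allocation size are the assumed example values from this post.

public class MaxConcurrentMappers {
    public static void main(String[] args) {
        int nodes = 100;                     // assumed cluster size from the example
        long yarnMemPerNodeMb = 96L * 1024;  // yarn.nodemanager.resource.memory-mb = 96 GB
        long minAllocationMb = 2048;         // yarn.scheduler.minimum-allocation-mb = map size = 2 GB

        // (YARN memory per node * number of nodes) / minimum allocation size
        long maxMappers = (yarnMemPerNodeMb * nodes) / minAllocationMb;
        System.out.println("Rough upper bound on concurrent mappers: " + maxMappers);  // prints 4800
    }
}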
Created 05-02-2016 01:52 PM
It depends on the memory you have on your cluster.
You have an amount of RAM allocated to YARN (yarn.nodemanager.resource.memory-mb) on each node, and each mapper has a size (in MR it's mapreduce.map.memory.mb); that gives you an idea (you'll also use memory for ApplicationMasters, reducers, etc.). A rough per-node sketch follows below.
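As a per-node estimate (assumed example values, and it deliberately ignores reducers), you can also set aside the ApplicationMaster's container before dividing the rest among mappers:

public class PerNodeMapperEstimate {
    public static void main(String[] args) {
        // Assumed example values for illustration only
        int yarnNodeMemMb = 96 * 1024;  // yarn.nodemanager.resource.memory-mb
        int mapMemMb      = 2048;       // mapreduce.map.memory.mb
        int amMemMb       = 2048;       // yarn.app.mapreduce.am.resource.mb (one AM per job)

        // Leave room for the ApplicationMaster, then divide what is left among map containers
        int mappersPerNode = (yarnNodeMemMb - amMemMb) / mapMemMb;
        System.out.println("Approximate concurrent mappers on this node: " + mappersPerNode);  // prints 47
    }
}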
Created 05-02-2016 11:02 PM
As others explained, it depends on the number of containers available on your cluster.