When I run my query, it will request from the Resource Manager:
- the creation of an Application Master (including the DAG) (??? GB)
- the creation of a node manager for the Map task (2x2GB)
- the creation of a node manager for the Reduce task (4x2GB)
If the cluster is busy (not enough memory available), then with the FIFO scheduler my query will be placed in a queue, waiting for resources to be freed (1 container for the AM, 1 container of 4GB for the map task and 1 container of 8GB for the reduce task).
First of all, am I right?
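To make that first part concrete, here is a minimal sketch of the arithmetic I have in mind. It only encodes my own assumption (the query waits until all three containers fit at once); it is not a description of how YARN actually schedules. The function names are made up for illustration, and the AM size is left as a parameter because I do not know it.

```python
# Sizes taken from the figures above; the AM size is unknown, so it stays a parameter.
MAP_CONTAINER_GB = 2 * 2      # "2x2GB" for the Map task
REDUCE_CONTAINER_GB = 4 * 2   # "4x2GB" for the Reduce task


def total_requested_gb(am_gb):
    """Total memory the query would ask for, given a hypothetical AM size in GB."""
    return am_gb + MAP_CONTAINER_GB + REDUCE_CONTAINER_GB


def can_start(free_gb, am_gb):
    """True if all three containers fit in the free memory at once (my FIFO assumption)."""
    return free_gb >= total_requested_gb(am_gb)
```

For example, with a purely hypothetical 1GB AM, `can_start(12, 1)` would be false under this assumption, because 1 + 4 + 8 = 13GB would be needed.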
About the Fair Scheduler, I read: "With the Fair Scheduler, there is no need to reserve a set amount of capacity, since it will dynamically balance resources between all running jobs."
Does that mean I will not have to wait for three sufficiently large containers, and that I will instead get (for example) a 2GB container for the Map task and a 2GB container for the Reduce task IF the average container size is 2GB? In that case, is increasing mapreduce.map|reduce.memory.mb useless?