09-11-2016 08:36 AM
Sometimes I face issues such as a Hive/MapReduce job getting stuck or not starting.
I usually tune properties such as mapreduce.map.memory.mb and mapreduce.map.cpu.vcores, and the same for the reducer. Sometimes I also need to check storage space and other issues.
But I have never changed the container size (yarn.scheduler.minimum-allocation-mb and yarn.scheduler.increment-allocation-mb).
Does the container size change according to the job's resource requirements?
I want to know when, and in which cases, these two properties should be used, and how.
09-18-2016 05:53 PM
You do not use either of these properties directly. The minimum is enforced by the scheduler: you cannot request a container smaller than the minimum, and the scheduler rounds any smaller request up to the minimum container size.
The increment is used to round container sizes up, which makes the scheduler's internal housekeeping simpler. Neither setting influences the job settings directly; they are applied on top of them. You still need to set the resources a job needs as part of the job configuration.
Examples:
1) Request for a 600MB/1vcore container, minimum container size is 1GB -> a 1GB/1vcore container is allocated.
2) Request for a 1200MB/1vcore container, minimum size is 1GB, increment is 500MB -> a 1.5GB container is allocated (rounded up to the next increment, with the minimum used as the base).
The same happens for vcores.
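The rounding described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this answer (minimum as the base, then whole increments on top), not the scheduler's actual code. The function name normalize_request and its parameters are made up for the example, and 1GB is taken as 1024MB. A real scheduler may round slightly differently (for example to a multiple of the increment) and also caps requests at a configured maximum, which is left out here.

```python
import math


def normalize_request(requested, minimum, increment):
    """Round a requested amount (MB or vcores) as described in the answer.

    Hypothetical helper for illustration only: requests at or below the
    minimum get the minimum; larger requests get the minimum plus enough
    whole increments to cover the request.
    """
    if requested <= minimum:
        return minimum
    # Number of whole increments needed on top of the minimum.
    steps = math.ceil((requested - minimum) / increment)
    return minimum + steps * increment


# Example 1: 600MB requested, 1GB (1024MB) minimum.
print(normalize_request(600, 1024, 500))   # 1024

# Example 2: 1200MB requested, 1GB minimum, 500MB increment.
print(normalize_request(1200, 1024, 500))  # 1524, i.e. roughly 1.5GB
```

The same function can be called with vcore counts instead of megabytes, since the answer applies the same rule to both.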
For more information, see the Untangling YARN blog series.