I am running a cluster with 15 data nodes, 15 region servers and
16 node managers (plus, of course, a name node, secondary name node, active HMaster and
resource manager). All the machines are of type m3.large, so each has a 2-core processor
and 7.5 GB of RAM.
By default it allocates 32 GB of YARN memory and 1 vcore.
Here is my default configuration; it uses DefaultResourceCalculator.
When I run a MapReduce job, it takes about 30 minutes to
complete, and YARN memory utilization stays high the whole time. I assumed that
YARN memory was the bottleneck, so I doubled it, as shown below.
Now YARN memory has increased
from 32 GB to 64 GB, but when I run the same MapReduce job with the new configuration
it takes around 42 minutes. Even though all 64 GB of YARN memory is in use, the cluster
seems slower than before. So I would like to understand how container resources are
allocated and why the job slowed down after I increased the memory. I would also
like to know how to calculate the number of containers per cluster and per node. Please
suggest a recommended configuration for this case.
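
For reference, here is a rough sketch of how I currently think the container count works out; please correct it if it is wrong. It assumes that the YARN memory is split evenly across the 16 node managers, that DefaultResourceCalculator counts memory only (vcores are ignored), and that each request is rounded up to a multiple of yarn.scheduler.minimum-allocation-mb. The 1024 MB request and the 1024 MB minimum allocation are assumed values, not taken from my configuration, and the function name is just illustrative. The ApplicationMaster container is not modelled.

```python
import math

def containers_per_node(node_memory_mb, request_mb, min_allocation_mb=1024):
    """Estimate containers per node when only memory is counted.

    Assumption: the scheduler rounds each request up to a multiple of
    min_allocation_mb, and vcores play no part in the decision.
    """
    granted_mb = math.ceil(request_mb / min_allocation_mb) * min_allocation_mb
    return node_memory_mb // granted_mb

NODE_MANAGERS = 16

for cluster_gb in (32, 64):
    # Assumption: cluster YARN memory is spread evenly over the node managers.
    node_mb = cluster_gb * 1024 // NODE_MANAGERS
    per_node = containers_per_node(node_mb, request_mb=1024)
    print(f"{cluster_gb} GB total: {node_mb} MB per node, "
          f"{per_node} containers per node, "
          f"{per_node * NODE_MANAGERS} per cluster")
```

If this is right, doubling the memory doubles the number of containers that can run at once on each node, while each node still has only 2 cores and 7.5 GB of physical RAM.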