I am running into an issue where a MapReduce job causes the Hadoop environment to run out of memory. It appears YARN is configured with values higher than the settings Hortonworks recommends, so a large amount of RAM is allocated when the containers are created. My input file is over 600 million rows. It appears the container memory and the reserved system memory should both be reduced.
We ran this script to identify the allocated resources, and the returned values are shown below:
Memory per container is four times the recommended value, and reserved system memory is set to twice the recommended value.
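For reference, my understanding is that those two values map onto the yarn-site.xml properties below (the numbers are placeholders for illustration, not our actual settings; reserved system memory is not a property of its own, it is implied by how much of each node's 256 GB is handed to yarn.nodemanager.resource.memory-mb):

<configuration>
  <!-- yarn-site.xml sketch with placeholder values, not our cluster's actual settings -->
  <property>
    <!-- RAM per node given to YARN: total RAM minus reserved system memory -->
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>229376</value> <!-- e.g. 256 GB minus 32 GB reserved = 224 GB -->
  </property>
  <property>
    <!-- smallest container YARN will allocate; requests round up to a multiple of this -->
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <!-- largest single container YARN will allocate -->
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
</configuration>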
Can someone confirm: if I change the reserved system memory and the container memory in yarn-site.xml, do I need to change any additional values? The Hadoop cluster has 24 cores, 256 GB RAM, and 12 disks per node.
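From what I understand (please correct me if I am wrong), the per-task settings in mapred-site.xml usually need to shrink along with the container sizes, since each map and reduce task runs in a YARN container. Something along these lines, again with placeholder numbers rather than our real values:

<configuration>
  <!-- mapred-site.xml sketch; these typically track the YARN container sizes above -->
  <property>
    <!-- container size requested for each map task -->
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <!-- container size requested for each reduce task, often twice the map size -->
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <!-- JVM heap for map tasks, commonly around 80% of mapreduce.map.memory.mb -->
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
  </property>
  <property>
    <!-- JVM heap for reduce tasks, commonly around 80% of mapreduce.reduce.memory.mb -->
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
  </property>
  <property>
    <!-- container size for the MapReduce ApplicationMaster -->
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>4096</value>
  </property>
</configuration>

In particular, I believe the -Xmx heap values have to stay below their corresponding container sizes, otherwise tasks are killed for exceeding container memory limits.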