
Efficient Memory management for Running MapReduce Jobs


We are trying to set up a single-node Cloudera Hadoop cluster on a Linux machine with 16 GB of RAM. We are installing version 5.4.2.

When we check the statistics after installation by running the top command, we find that only 1-2 GB is available. When we trigger a sample MapReduce job, no memory is allocated to it, so the job does not run.


Can you please let us know what we should do so that more memory is available?


Analysis of the top command output:

Cloudera takes 4-5 GB.

MySQL (the external database that stores the metastore) takes 6 GB.

Other services take 2-3 GB.

Together these account for about 13 GB of the 16 GB.


Re: Efficient Memory management for Running MapReduce Jobs

You're likely facing this because "yarn.nodemanager.resource.memory-mb" is set too low. You can find it under CM -> YARN -> Configuration.

CM may set this value low during its auto-configuration because it accounts for all the other roles on the same host and their configured heap sizes. However, not all roles use their entire configured heap at all times, so yarn.nodemanager.resource.memory-mb can be raised to a reasonable value (8 GB or higher) so that container requests from jobs can be allocated.
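To make the effect of this setting concrete, here is a rough Python sketch of the arithmetic, not anything CM or YARN runs itself. The container sizes (1536 MB for the ApplicationMaster, 1024 MB per task) and the 1 GB "low" value are assumptions for illustration only, so check the actual values in your own configuration. The helper names are hypothetical.

```python
def gb_to_mb(gb):
    """Convert GB to the MB unit used by the *-mb properties."""
    return int(gb * 1024)


def tasks_after_am(node_memory_mb, am_mb, task_mb):
    """Return how many task containers fit once the ApplicationMaster
    container is placed, or None if even the ApplicationMaster does not fit."""
    if am_mb > node_memory_mb:
        return None  # the job cannot start: no room for the ApplicationMaster
    return (node_memory_mb - am_mb) // task_mb


# Assumed request sizes, for illustration only.
AM_MB = 1536    # assumed ApplicationMaster container request
TASK_MB = 1024  # assumed map/reduce task container request

# Compare a hypothetical low auto-configured value with the 8 GB suggestion.
for label, node_mb in [("low (1 GB)", gb_to_mb(1)), ("raised (8 GB)", gb_to_mb(8))]:
    tasks = tasks_after_am(node_mb, AM_MB, TASK_MB)
    if tasks is None:
        print(f"{label}: ApplicationMaster does not fit, job stays pending")
    else:
        print(f"{label}: ApplicationMaster fits, room for {tasks} task containers")
```

With a value that is smaller than the ApplicationMaster request, the job never gets its first container, which matches the "no memory is allocated" symptom. Raising the value leaves room for the ApplicationMaster plus several tasks.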

This problem is specific to single-node installations and does not occur on a distributed installation.