
YARN memory configuration parameters and Java Heap Space.


I'm trying to find the optimal memory configuration in YARN to run some MapReduce tasks in R. For the moment, I have a single node with around 40 GB of RAM available. I have tried different memory combinations, but all of them result in Java Heap Space exceptions when executing a simple MapReduce R job (using the plyrmr library) that processes a small (a few KB) text file. The relevant memory configuration parameters I have so far (in yarn-site.xml and mapred-site.xml) are:

yarn.scheduler.maximum-allocation-mb = 24576
yarn.scheduler.minimum-allocation-mb = 3076
mapreduce.map.memory.mb = 3072
mapreduce.map.java.opts = -Xmx2457m
mapreduce.reduce.memory.mb = 6144
mapreduce.reduce.java.opts = -Xmx4915m
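For reference, the mapreduce.* properties above live in mapred-site.xml; a minimal sketch with my values (the -Xmx heap in the *.java.opts settings is conventionally kept at roughly 80% of the container size, so the JVM has headroom for non-heap memory):

```xml
<!-- mapred-site.xml (sketch, using the values listed above) -->
<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>3072</value>      <!-- container size requested per map task -->
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx2457m</value> <!-- ~80% of the map container -->
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>6144</value>      <!-- container size requested per reduce task -->
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx4915m</value> <!-- ~80% of the reduce container -->
  </property>
</configuration>
```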

Is there any other memory configuration parameter that needs to be set or adjusted? After launching the task, two splits are created and a Java Heap Space exception is raised. Looking through the YARN logs of the application that raises the exception, I stumbled upon the following line in the container launch command:

exec /bin/bash -c "$JAVA_HOME/bin/java -server -XX:NewRatio=8 -Dhdp.version= -Xmx400M 

What are these "400 MB" of Java Space for? I have checked a lot of different configuration files but I couldn't find any parameter related to these 400MB of space. Is there any other Java parameter that needs to be set in the aforementioned list of configuration properties?
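For comparison, here is the usual sizing arithmetic (a sketch; the 0.8 heap-to-container ratio is just the common rule of thumb, not something taken from my logs):

```python
def heap_for_container(container_mb, ratio=0.8):
    """Suggested -Xmx (in MB) for a YARN container of the given size.

    The 0.8 ratio is a common rule of thumb: the JVM needs headroom
    beyond the heap for metaspace, thread stacks, and native buffers.
    """
    return int(container_mb * ratio)

# Containers from my configuration vs. the 400 MB seen in the launch command:
print(heap_for_container(3072))  # map container    -> 2457 (matches -Xmx2457m)
print(heap_for_container(6144))  # reduce container -> 4915 (matches -Xmx4915m)
```

A JVM launched with -Xmx400M inside a 3072 MB container would exhaust its heap long before reaching the container limit, which would be consistent with the exception above.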

The relevant log part of the MR task is:

INFO mapreduce.Job: Counters: 17
	Job Counters 
		Failed map tasks=7
		Killed map tasks=1
		Killed reduce tasks=1
		Launched map tasks=8
		Other local map tasks=6
		Data-local map tasks=2
		Total time spent by all maps in occupied slots (ms)=37110
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=37110
		Total time spent by all reduce tasks (ms)=0
		Total vcore-seconds taken by all map tasks=37110
		Total vcore-seconds taken by all reduce tasks=0
		Total megabyte-seconds taken by all map tasks=114001920
		Total megabyte-seconds taken by all reduce tasks=0
	Map-Reduce Framework
		CPU time spent (ms)=0
		Physical memory (bytes) snapshot=0
		Virtual memory (bytes) snapshot=0

Is there anything that I'm missing?

Thanks a lot for your time.


Hi @Jaime

I think it is the NameNode's total Java max heap size.

Please go through these settings (you have to change them from the HDFS config):

Please let us know if this fixes the problem.
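For reference, a sketch of where that value typically lives on a plain Hadoop install; HADOOP_NAMENODE_OPTS in hadoop-env.sh is the standard knob (on managed clusters it is edited through the HDFS configuration page, as described above):

```shell
# hadoop-env.sh (sketch) -- where the NameNode heap is usually set.
# The -Xmx here is the "NameNode total Java max heap size" referred to above.
export HADOOP_NAMENODE_OPTS="-Xmx1024m ${HADOOP_NAMENODE_OPTS}"
```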


Hi @rbiswas,

Thanks for your comment. I didn't know how to adjust that memory parameter. However, looking into it, I discovered that it was set to 1024 MB:


Unfortunately, that didn't solve the problem.

Hi again @rbiswas,

To my knowledge, each time a mapper (or reducer) is created, the ApplicationMaster will request a new container with mapreduce.map.memory.mb (or mapreduce.reduce.memory.mb) MB available. So, with my specific configuration, if three mappers are created, YARN will try to create three containers with 3072 MB each. Am I right?

If so, what if YARN can't reserve (3*3072MB)? Will it raise a Java Heap Space Exception?
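The allocation arithmetic in my question can be sketched like this (assuming YARN normalizes each container request up to a multiple of yarn.scheduler.minimum-allocation-mb, which is what the default resource calculator does):

```python
import math

def normalized_container_mb(request_mb, min_alloc_mb):
    """YARN rounds each container request up to a multiple of
    yarn.scheduler.minimum-allocation-mb (a sketch of that normalization)."""
    return min_alloc_mb * math.ceil(request_mb / min_alloc_mb)

min_alloc = 3076    # yarn.scheduler.minimum-allocation-mb from my config
map_request = 3072  # mapreduce.map.memory.mb

per_container = normalized_container_mb(map_request, min_alloc)
print(per_container)      # 3076: the 3072 MB request rounds up to one minimum allocation
print(3 * per_container)  # 9228 MB needed to run three mappers concurrently
```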

Thanks in advance.
