05-13-2015 09:48 AM
I have reset the values of YARN's Java Heap Size of NodeManager and Java Heap Size of ResourceManager via CM.
Then I restarted the cluster.
<Q1> In which file (xml, sh, map, py) are these parameters stored?
(When I look under /etc/hadoop/conf.cloudera.yarn after the cluster restart, the only files that got updated are topology.map and topology.py. Files such as core-site.xml and mapred-site.xml haven't changed!)
[BTW, when I execute ps aux | grep resourcemanager (or ps aux | grep nodemanager), are -Xms and -Xmx the values that tell me the Java Heap Size?]
<Q2> How do you determine the best value for these parameters (Java Heap Size)?
(Is it a trial-and-error scenario? Are there any best practices to follow?)
05-13-2015 06:22 PM
The comment on the setting in CM should have explained it for you:
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
You cannot set that in a configuration file, because the heap size has to be specified when the JVM starts, which is before any configuration file is read. CM passes the value to the agent, and the agent then builds the java command line. The only place the values are stored is in CM.
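Since the value only ends up on the java command line, one way to check what a running daemon got is to pull the -Xms and -Xmx flags out of its ps output. Below is a minimal sketch of that idea. The function name heap_settings and the sample command line are hypothetical and made up for illustration; a real ResourceManager line will carry many more arguments.

```python
import re

# Matches JVM heap flags such as -Xmx1024m, -Xms2g, or -Xmx1073741824 (plain bytes).
HEAP_FLAG = re.compile(r"-(Xms|Xmx)(\d+)([kKmMgG]?)(?=\s|$)")
UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_settings(cmdline):
    """Return the heap flags found in a java command line, converted to bytes."""
    found = {}
    for name, number, unit in HEAP_FLAG.findall(cmdline):
        # If a flag appears more than once, the later occurrence overwrites the earlier one.
        found[name] = int(number) * UNIT_BYTES[unit.lower()]
    return found


# Hypothetical command line, shaped like the tail of one `ps aux` row.
sample = (
    "java -Dproc_resourcemanager -Xms1073741824 -Xmx1073741824 "
    "org.apache.hadoop.yarn.server.resourcemanager.ResourceManager"
)
print(heap_settings(sample))
```

To use it against a live process, paste the line printed by ps aux | grep resourcemanager in place of sample.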
Under normal circumstances, in a small cluster of, let's say, 10 nodes, 2 GB should be enough for the RM. In an average-size cluster of 50 nodes, an RM should not need more than 4 GB. In a large cluster, with a hundred or more NMs, you need to increase that. For an NM you can normally leave the heap at 1 GB on small nodes or 2-4 GB on large nodes. The size of a node here means the number of containers you can run on it.
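The ResourceManager rule of thumb above can be written down as a small lookup. This is only a sketch: the function name rm_heap_starting_point_gb is hypothetical, and the cut-offs between the 10, 50 and 100 node examples are assumptions, because the advice gives sample cluster sizes rather than exact boundaries. For a hundred or more nodes no figure is given, so the function only reports 4 GB as a floor to increase from.

```python
def rm_heap_starting_point_gb(node_count):
    """Return (heap_gb, is_floor_only) as a starting point for the RM heap.

    is_floor_only is True when the advice is just "increase beyond this"
    and no concrete figure is available.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if node_count <= 10:
        # Small cluster: 2 GB should be enough.
        return 2, False
    if node_count < 100:
        # Average cluster (around 50 nodes): no more than 4 GB should be needed.
        # Treating everything from 11 to 99 nodes this way is an assumption.
        return 4, False
    # Large cluster: more than 4 GB is needed, exact value not stated.
    return 4, True


for nodes in (10, 50, 200):
    print(nodes, rm_heap_starting_point_gb(nodes))
```

Treat the result as a first setting to try, then adjust it based on how the ResourceManager actually behaves.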
05-14-2015 04:39 AM
Thank you for the explanation!
BTW, where do you see the 'Comments'? I don't have that!
<< The comment on the setting in CM should have explained it for you:
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.>>