
What is the main constraint on running larger YARN jobs and how do I increase it?



What is the main constraint on running larger YARN jobs (Hadoop version HDP-3.1.0.0 (3.1.0.0-78)) and how do I increase it? Basically, I want to run more sqoop jobs (all of which are pretty large) concurrently.

 

I am currently assuming that I need to increase the ResourceManager heap size (since that is what I see going up on the Ambari dashboard when I run YARN jobs). How do I add more resources to the RM heap, and why does the RM heap appear to be such a small fraction of the total RAM available (to YARN?) across the cluster?
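
For context, this is a rough sketch (not something I am sure is the right approach) of how I have been checking what the RM JVM heap is actually capped at, via the daemon's /jmx endpoint. I am assuming the default RM web UI port 8088 over plain HTTP, and the hostname is just a placeholder:

```python
# Rough sketch: ask the ResourceManager's /jmx endpoint what its JVM heap
# is actually capped at and how much is in use. Assumes the default RM web
# UI port 8088 over plain HTTP; the hostname below is a placeholder.
import requests

RM = "http://rm-host.example.com:8088"  # placeholder, not my real host

beans = requests.get(f"{RM}/jmx", params={"qry": "java.lang:type=Memory"}).json()["beans"]
heap = beans[0]["HeapMemoryUsage"]  # values are in bytes

MB = 1024 * 1024
print(f"RM heap used:       {heap['used'] / MB:.0f} MB")
print(f"RM heap committed:  {heap['committed'] / MB:.0f} MB")
print(f"RM heap max (-Xmx): {heap['max'] / MB:.0f} MB")
```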

 

Looking in Ambari: YARN cluster memory is 55 GB, but the RM heap is only 900 MB.

(A copy of the posted image can be seen here: https://stackoverflow.com/q/64627521/8236733)

 

Could anyone with more experience tell me what the difference is and which is the limiting factor in running more YARN applications (and again, how do I increase it)? Anything else that I should be looking at? Any docs explaining this in more detail?
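
In case it is useful to anyone answering, here is a small sketch of how I am comparing the scheduler-level numbers (same assumptions as above: RM web UI on port 8088 over plain HTTP, placeholder hostname):

```python
# Rough sketch: pull the scheduler-level numbers from the RM REST API so I can
# compare total/allocated/available container memory with the number of running
# and pending applications. Assumes port 8088, plain HTTP, placeholder hostname.
import requests

RM = "http://rm-host.example.com:8088"  # placeholder, not my real host

m = requests.get(f"{RM}/ws/v1/cluster/metrics").json()["clusterMetrics"]

print(f"total cluster memory:    {m['totalMB']} MB")   # the ~55 GB figure from Ambari?
print(f"allocated to containers: {m['allocatedMB']} MB")
print(f"available:               {m['availableMB']} MB")
print(f"apps running / pending:  {m['appsRunning']} / {m['appsPending']}")
```

My rough understanding is that totalMB here is the 55 GB "YARN cluster memory" figure (i.e. memory for containers), which seems to be a separate pool from the RM daemon's own heap, but that is exactly the part I would like someone to confirm.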