
YARN memory allocation & utilization

Expert Contributor

I am running a 2-node cluster where the master and worker have the following configuration.

Master : 8 Cores, 16GB RAM

Worker : 16 Cores, 64GB RAM

YARN configuration:

yarn.scheduler.minimum-allocation-mb: 1024
yarn.scheduler.maximum-allocation-mb: 22145
yarn.nodemanager.resource.cpu-vcores: 6
yarn.nodemanager.resource.memory-mb: 25145

Capacity Scheduler:

yarn.scheduler.capacity.default.minimum-user-limit-percent=100
yarn.scheduler.capacity.maximum-am-resource-percent=0.5
yarn.scheduler.capacity.maximum-applications=100
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.acl_administer_jobs=*
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=100
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=1
yarn.scheduler.capacity.root.queues=default

We have 23 Spark jobs (scheduled in Oozie) running on YARN every hour. Some jobs are taking longer than expected to complete. I am not sure whether the YARN memory and vcore allocation is configured properly.

Please suggest the recommended YARN memory, vcore, and scheduler configuration based on the available cores and RAM.

Thanks,

Sampath

1 ACCEPTED SOLUTION


It looks like you are only letting YARN use 25 GB of your worker node's 64 GB of RAM, and only 6 of its 16 CPU cores, so these values should be raised. Check out https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_command-line-installation/content/determ... for a script that can help you set baseline values for these properties.
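For example (illustrative values only, assuming roughly 6-8 GB of RAM and two cores are reserved for the OS, DataNode, and NodeManager on the 16-core / 64 GB worker; the script above will give you numbers tuned to your actual stack), the NodeManager settings could look more like:

yarn.nodemanager.resource.memory-mb: 57344
yarn.nodemanager.resource.cpu-vcores: 14
yarn.scheduler.maximum-allocation-mb: 57344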

As for the Spark jobs: each of them requests a certain number and size of containers, and I'd bet each job is a bit different. Since a Spark job acquires its resources up front, a given job (as long as neither its resource request nor its input dataset size changes) should take a comparable time to run from invocation to invocation. That isn't necessarily the case across different Spark jobs, which may be doing entirely different things.
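If you want to see or control what each job asks for, the executor count and size can be pinned explicitly on the spark-submit command line (or in the Oozie Spark action). The values and jar name below are placeholders to show the knobs, not a recommendation for your workload:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 12g \
  --conf spark.yarn.executor.memoryOverhead=1536 \
  your-job.jar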

Good luck and happy Hadooping/Sparking!


3 REPLIES

Expert Contributor

Thanks for your inputs.

Cloudera Employee

Hi,

Hope this document will clarify your doubts; it is a tuning guide:

https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/
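As a rough illustration of the sizing approach in that post, applied to your 16-core / 64 GB worker (the reserved amounts and the ~7% overhead factor are assumptions, not fixed rules): reserving about 1 core and 1 GB for the OS and Hadoop daemons leaves 15 cores and ~63 GB; at about 5 cores per executor that is 3 executors on the node, and 63 GB / 3 ≈ 21 GB each, which after subtracting ~7% for spark.yarn.executor.memoryOverhead comes to roughly 19 GB per executor, with a little headroom kept back for the application master.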

 

Thanks

AK