Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Yarn memory allocation & utilization

Expert Contributor

I am running a 2-node cluster where the master and worker have the following configurations.

Master : 8 Cores, 16GB RAM

Worker : 16 Cores, 64GB RAM

YARN configuration:

yarn.scheduler.minimum-allocation-mb: 1024
yarn.scheduler.maximum-allocation-mb: 22145
yarn.nodemanager.resource.cpu-vcores : 6
yarn.nodemanager.resource.memory-mb: 25145

Capacity Scheduler:

yarn.scheduler.capacity.default.minimum-user-limit-percent=100
yarn.scheduler.capacity.maximum-am-resource-percent=0.5
yarn.scheduler.capacity.maximum-applications=100
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default.acl_administer_jobs=*
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=100
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=1
yarn.scheduler.capacity.root.queues=default

We have 23 Spark jobs (scheduled in Oozie) running on YARN every hour. Some jobs are taking longer to complete, and I am not sure whether the YARN memory and vcore allocation is configured properly.

Please suggest the recommended YARN memory, vcore, and scheduler configuration based on the available cores and RAM.

Thanks,

Sampath

1 ACCEPTED SOLUTION


It looks like you are only letting YARN use about 25 GB of your worker node's 64 GB, and only 6 of its 16 CPU cores, so these values should be raised. Check out the details at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_command-line-installation/content/determ... for a script that can help you set baseline values for these properties.
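As a rough illustration only (the linked script will give you proper baselines): if you reserve, say, 8 GB of RAM and 2 cores for the OS and Hadoop daemons on the 64 GB / 16-core worker, the settings might look like:

```
yarn.nodemanager.resource.memory-mb: 57344
yarn.nodemanager.resource.cpu-vcores: 14
yarn.scheduler.maximum-allocation-mb: 57344
yarn.scheduler.minimum-allocation-mb: 1024
```

The exact headroom to reserve depends on what else runs on that node (DataNode, NodeManager heap, etc.), so treat these numbers as a sketch, not a recommendation.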

As for the Spark jobs: each of them requests its own number and size of containers, and I'm betting each job is a bit different. Since a Spark job gets its resources up front, a given job (as long as its resource request and the size of its input dataset don't change) should take a comparable time to run from invocation to invocation. That isn't necessarily the case across different Spark jobs, which may be doing entirely different things.
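To see why both knobs matter: with the settings quoted in the question, vcores, not memory, cap how many containers the node can run at once. A quick sketch (the 2 GB / 1 vcore container size is an assumption for illustration, not from the post):

```python
# Illustrative container math for the settings quoted in the question.
node_mem_mb = 25145        # yarn.nodemanager.resource.memory-mb
node_vcores = 6            # yarn.nodemanager.resource.cpu-vcores

container_mem_mb = 2048    # assumed: a typical 2 GB container request
container_vcores = 1       # assumed: default 1 vcore per container

containers_by_mem = node_mem_mb // container_mem_mb      # 12 containers fit by memory
containers_by_vcores = node_vcores // container_vcores   # only 6 fit by vcores
max_containers = min(containers_by_mem, containers_by_vcores)
print(max_containers)  # 6 -- vcores are the bottleneck despite 25 GB of memory
```

So with 23 hourly Spark jobs competing for at most a handful of containers per node, raising both limits should directly reduce queueing time.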

Good luck and happy Hadooping/Sparking!


3 REPLIES


Expert Contributor

Thanks for your inputs.

Cloudera Employee

Hi,

I hope this tuning document will clarify your doubts:

https://blog.cloudera.com/how-to-tune-your-apache-spark-jobs-part-2/

Thanks

AK