
How is "yarn.nodemanager.resource.cpu-vcores" value determined?


Hello guys,

I have a SUSE 11 SP4 machine where I have installed and configured HDP 2.3 (YARN, MapReduce, etc.). I made no modifications during the installation with the Ambari UI - just clicked Next all the way through. I am using an Amazon image of m4.xlarge size, which means 4 vCPUs and 16 GiB of memory. The YARN version is 2.7.1.2.3.

When I open the

/etc/hadoop/2.3.4.7-4/0/yarn-site.xml

I see the following entries there:

<property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>3</value>
</property>     

<property>
      <name>yarn.scheduler.minimum-allocation-vcores</name>
      <value>1</value>
</property>

<property>
      <name>yarn.scheduler.maximum-allocation-vcores</name>
      <value>3</value>
</property>

My question is: how is the "yarn.nodemanager.resource.cpu-vcores" value determined? I thought that this value corresponds to the number of vCPUs, which in this case is 4.

1 ACCEPTED SOLUTION


@Elitsa Milanova By default, yarn.nodemanager.resource.cpu-vcores is set to roughly 80% of the total vCPUs available on the machine. Ambari's internal script picks this default based on that calculation, AFAIK. But it may not always be the best practice, depending on what other non-YARN components you are running on the machine, OS requirements, etc. These default configs are a starting point and will need tuning/changes depending on your use case, workload requirements, and cluster/host specifications.
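
To make the arithmetic concrete, here is a minimal sketch of the ~80% heuristic described above (the function name and rounding are illustrative assumptions; Ambari's actual script may apply additional rules, so real defaults can differ):

import math

# Illustrative sketch only: assumes the ~80% heuristic described above.
def default_nodemanager_vcores(total_vcpus, yarn_share=0.8):
    # Reserve roughly 20% of vCPUs for the OS and non-YARN services,
    # give the rest to the NodeManager (never less than 1).
    return max(1, math.floor(total_vcpus * yarn_share))

print(default_nodemanager_vcores(4))  # m4.xlarge: 4 * 0.8 = 3.2 -> 3, matching the default above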




Hi @Sagar Shimpi,

Thank you for the posts.

I get the idea of how to configure the values, but what I still do not understand is why, when I use a c4.xlarge, for example, the value is set to 3, and when I use an m3.2xlarge, it is set to 1, keeping in mind that I made no explicit configuration changes and the configurations do not differ between the two hosts. How is this default value set? If this is a default, why isn't it 1 every time? 🙂



Thank you, @Pardeep and @Sagar Shimpi!

Finally, from the articles above and from your replies, I have managed to put together a short summary. I am posting it here in case somebody else wonders about this 🙂

"In order to handle the variety of workloads related with intense CPU usage, YARN has introduced a new concept called "vcores" (short for virtual cores). A vcore, is a usage share of a host CPU which YARN Node Manager allocates to use all available resources in the most efficient possible way. YARN hosts can be tuned to optimize the use of vcores by configuring the available YARN containers as the number of vcores has to be set by an administrator in yarn-site.xml on each node. The decision of how much it should be set to is driven by the type of workloads running in the cluster and the type of hardware available. The general recommendation is to set it to the number of physical cores on the node, but administrators can bump it up if they wish to run additional containers on nodes with faster CPUs. In order to enable CPU scheduling, there are some configuration properties that administrators and users need to be aware of:
  • yarn.nodemanager.resource.cpu-vcores: Set to the appropriate number in yarn-site.xml on all the nodes. This is strictly dependent on the type of workloads running in a cluster, but the general recommendation is that admins set it to be equal to the number of physical cores on the machine.
  • yarn.scheduler.minimum-allocation-vcores: This is the minimum allocation for every container request at the Resource Manager, in terms of virtual CPU cores. Requests lower than this won't take effect, and will be allocated this minimum value.
  • yarn.scheduler.maximum-allocation-vcores: This is the maximum allocation for every container request at the Resource Manager, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.
“yarn.scheduler.maximum-allocation-vcores” controls the maximum vcores that any submitted job can request. “yarn.nodemanager.resource.cpu-vcores” controls how many vcores can be scheduled on a particular NodeManager instance. So “yarn.nodemanager.resource.cpu-vcores” can vary from host to host (NodeManager to NodeManager), while “yarn.scheduler.maximum-allocation-vcores” is a global property of the scheduler."
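
As a quick illustration of the minimum/maximum semantics described in the bullets above, here is a small sketch of the described behaviour (not the actual ResourceManager code; the constants mirror the yarn-site.xml values from the question):

# Sketch of the clamping behaviour described above; constants mirror the
# yarn-site.xml values from the original question.
MIN_ALLOC_VCORES = 1   # yarn.scheduler.minimum-allocation-vcores
MAX_ALLOC_VCORES = 3   # yarn.scheduler.maximum-allocation-vcores

def normalize_vcore_request(requested):
    # Requests below the minimum are raised to it;
    # requests above the maximum are capped to it.
    return min(max(requested, MIN_ALLOC_VCORES), MAX_ALLOC_VCORES)

print(normalize_vcore_request(0))  # 1 (raised to the minimum)
print(normalize_vcore_request(2))  # 2 (within range, unchanged)
print(normalize_vcore_request(8))  # 3 (capped to the maximum)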

Further information can also be found here: https://community.cloudera.com/t5/Cloudera-Manager-Installation/yarn-nodemanager-resource-cpu-vcores...