
YARN - VCores max

Solved

Can YARN allocate more vcores (containers) than the cluster's "VCores Total" (the sum of yarn.nodemanager.resource.cpu-vcores across all NodeManagers)?

In my tests, only memory limits whether applications are accepted and containers run. I expected allocation to be limited by both memory and the available vcores.

See the screenshot below, where "VCores Used" is greater than "VCores Total".

[Screenshot: 611-yarn-vcores.png]
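For reference, the "VCores Total" shown in the UI is the sum of what each NodeManager advertises via yarn.nodemanager.resource.cpu-vcores in yarn-site.xml (the value below is just an example):

    <!-- yarn-site.xml: vcores this NodeManager advertises to the ResourceManager -->
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>8</value>
    </property>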

1 ACCEPTED SOLUTION

Re: YARN - VCores max

Contributor

The DefaultResourceCalculator only takes memory into account. Here is a brief explanation of what you are seeing (the paragraph on the DefaultResourceCalculator below is the relevant part).

Pluggable resource-vector in YARN scheduler

The CapacityScheduler has the concept of a ResourceCalculator – a pluggable layer that carries out the math of allocations by looking at all the identified resources. This includes utilities to help make the following decisions (both checks are sketched in the example after this list):

  • Does this node have enough of each resource type to satisfy this request?
  • How many containers can I fit on this node? (This is also used when sorting a list of nodes with varying resources available.)
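
Here is a minimal, hypothetical Java sketch of the two container-counting strategies described above. These are not the actual Hadoop classes or method signatures; all names are illustrative only.

    // Hypothetical sketch only – NOT the real Hadoop ResourceCalculator API.
    public class ResourceCalculatorSketch {

        // A resource vector for a node or a container request.
        record Resource(long memoryMb, int vcores) {}

        // DefaultResourceCalculator-style math: only memory is considered,
        // so vcores can be oversubscribed (what the screenshot shows).
        static long containersByMemoryOnly(Resource available, Resource required) {
            return available.memoryMb() / required.memoryMb();
        }

        // DominantResourceCalculator-style math: the scarcest dimension
        // (memory or vcores) limits the container count.
        static long containersByDominantResource(Resource available, Resource required) {
            long byMemory = available.memoryMb() / required.memoryMb();
            long byVcores = available.vcores() / required.vcores();
            return Math.min(byMemory, byVcores);
        }

        public static void main(String[] args) {
            Resource node = new Resource(32768, 8);    // 32 GB, 8 vcores per node
            Resource request = new Resource(2048, 2);  // 2 GB, 2 vcores per container

            // Memory-only: 32768 / 2048 = 16 containers, i.e. 32 vcores
            // "used" on an 8-vcore node.
            System.out.println(containersByMemoryOnly(node, request));

            // Dominant-resource: min(16, 4) = 4 containers.
            System.out.println(containersByDominantResource(node, request));
        }
    }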

There are two kinds of calculators currently available in YARN – the DefaultResourceCalculator and the DominantResourceCalculator.

The DefaultResourceCalculator only takes memory into account when doing its calculations. This is why CPU requirements are ignored when carrying out allocations in the CapacityScheduler by default. All the math of allocations is reduced to just examining the memory required by resource-requests and the memory available on the node that is being looked at during a specific scheduling-cycle.
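
If you want vcores to be enforced as well, you can switch the CapacityScheduler to the DominantResourceCalculator in capacity-scheduler.xml. A minimal sketch (you will generally need to restart the ResourceManager for the change to take effect):

    <!-- capacity-scheduler.xml: make the scheduler consider both memory and vcores -->
    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
    </property>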

You can find more on this topic on our blog:

managing-cpu-resources-in-your-hadoop-yarn-clusters


Re: YARN - VCores max

Thanks @Shane Kumpf, I'd been meaning to clarify this for a while. It's totally clear now.

Have you seen people using the DominantResourceCalculator? That one makes much more sense to me.