YARN - VCores max

Can YARN allocate more vcores (containers) than the cluster's VCores Total (the sum of yarn.nodemanager.resource.cpu-vcores across all NodeManagers)?

In my tests, only memory limits whether applications are accepted and containers are executed. I expected allocation to be limited by both the memory and the vcores available.

See screenshot below where "VCores used" is greater than "VCores total".

611-yarn-vcores.png

1 ACCEPTED SOLUTION

DefaultResourceCalculator only takes memory into account. Here is a brief explanation of what you are seeing.

Pluggable resource-vector in YARN scheduler

The CapacityScheduler has the concept of a ResourceCalculator – a pluggable layer that is used for carrying out the math of allocations by looking at all the identified resources. This includes utilities to help make the following decisions:

  • Does this node have enough resources of each resource-type to satisfy this request?
  • How many containers can I fit on this node?
  • How should a list of nodes with varying available resources be sorted?
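
As a rough illustration, the fit check and the container count from the list above could be sketched like this in Python for a calculator that looks at both memory and vcores. The `Resource` type and both function names are hypothetical and are not YARN's actual Java API; the sketch also assumes every request asks for at least 1 MB and 1 vcore.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical two-dimensional resource vector (not a YARN class)."""
    memory_mb: int
    vcores: int


def fits(request: Resource, available: Resource) -> bool:
    """True if the node has enough of every resource type for the request."""
    return (request.memory_mb <= available.memory_mb
            and request.vcores <= available.vcores)


def containers_that_fit(request: Resource, available: Resource) -> int:
    """Number of identical containers that fit; the scarcest resource decides."""
    return min(available.memory_mb // request.memory_mb,
               available.vcores // request.vcores)


node_free = Resource(memory_mb=8192, vcores=4)
ask = Resource(memory_mb=1024, vcores=1)
print(fits(ask, node_free), containers_that_fit(ask, node_free))  # True 4
```

A memory-only calculator would effectively drop the vcores term from both functions, which is the behaviour described further down.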

There are two kinds of calculators currently available in YARN – the DefaultResourceCalculator and the DominantResourceCalculator.

The DefaultResourceCalculator only takes memory into account when doing its calculations. This is why CPU requirements are ignored when carrying out allocations in the CapacityScheduler by default. All the math of allocations is reduced to just examining the memory required by resource-requests and the memory available on the node that is being looked at during a specific scheduling-cycle.
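
To make the effect concrete, here is a minimal Python simulation of memory-only admission on a single node. The `Node` class, `allocate_memory_only`, and the 8 GB / 4 vcore figures are made up for illustration; they do not come from YARN's code or from the cluster in the question.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node with capacity and usage counters (not a YARN class)."""
    memory_mb: int
    vcores: int
    used_memory_mb: int = 0
    used_vcores: int = 0


def allocate_memory_only(node: Node, req_memory_mb: int, req_vcores: int) -> bool:
    """Admit a container if memory fits; vcores are recorded but never checked."""
    if node.used_memory_mb + req_memory_mb > node.memory_mb:
        return False
    node.used_memory_mb += req_memory_mb
    node.used_vcores += req_vcores
    return True


node = Node(memory_mb=8192, vcores=4)
granted = 0
# Keep requesting 1 GB / 1 vcore containers until the node refuses one.
while allocate_memory_only(node, req_memory_mb=1024, req_vcores=1):
    granted += 1

print(granted, node.used_vcores, node.vcores)  # 8 8 4
```

The loop stops only when memory runs out, so this node ends up with 8 vcores in use against 4 in total, the same kind of "VCores used" greater than "VCores total" reading as in the question. In the CapacityScheduler the calculator is selected with the yarn.scheduler.capacity.resource-calculator property in capacity-scheduler.xml; setting it to org.apache.hadoop.yarn.util.resource.DominantResourceCalculator makes vcores count as well.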

You can find more on this topic on our blog:

managing-cpu-resources-in-your-hadoop-yarn-clusters

2 REPLIES

Thanks @Shane Kumpf. I had been meaning to get this clarified for a while, and it's totally clear now.

Have you seen people using DominantResourceCalculator? It makes much more sense to me.