
How does the YARN Fair Scheduler allocate resources to a Tez application?

Hi,

I'm a little confused about the YARN Fair Scheduler, and I'd be glad if somebody could help me :-).

Let's say I run a Tez application (for example, a Hive query that boils down to a simple map/reduce) with the following parameters:

yarn.scheduler.maximum-allocation-vcores=16
yarn.scheduler.minimum-allocation-mb=2048
mapreduce.map.memory.mb=4096
mapreduce.reduce.memory.mb=8192

When I run my query, it will request the following from the ResourceManager:

- a container for the Application Master (including the DAG) (??? GB)

- a container on a NodeManager for the map task (2 x 2 GB)

- a container on a NodeManager for the reduce task (4 x 2 GB)
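The "2 x 2 GB" and "4 x 2 GB" figures above express each task's memory request in units of yarn.scheduler.minimum-allocation-mb. Here is a minimal Python sketch of that arithmetic. It assumes the scheduler rounds each request up to a whole multiple of the minimum allocation; the actual rounding step depends on the scheduler and its configuration, and the function name is hypothetical, not a YARN API:

```python
def normalize_request_mb(requested_mb: int, step_mb: int) -> int:
    """Round a memory request up to the next whole multiple of step_mb.

    Assumption for illustration only: the rounding step equals
    yarn.scheduler.minimum-allocation-mb.
    """
    if requested_mb <= 0 or step_mb <= 0:
        raise ValueError("memory sizes must be positive")
    units = (requested_mb + step_mb - 1) // step_mb  # ceiling division
    return units * step_mb


# Values taken from the question.
MINIMUM_ALLOCATION_MB = 2048   # yarn.scheduler.minimum-allocation-mb
MAP_MEMORY_MB = 4096           # mapreduce.map.memory.mb
REDUCE_MEMORY_MB = 8192        # mapreduce.reduce.memory.mb

for name, requested in (("map", MAP_MEMORY_MB), ("reduce", REDUCE_MEMORY_MB)):
    granted = normalize_request_mb(requested, MINIMUM_ALLOCATION_MB)
    units = granted // MINIMUM_ALLOCATION_MB
    print(f"{name}: {requested} MB requested -> {granted} MB "
          f"({units} x {MINIMUM_ALLOCATION_MB} MB)")
```

Under the same assumption, a request that is not an exact multiple (say 3000 MB) would be rounded up to 4096 MB rather than down to 2048 MB.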

1)

If the cluster is busy (not enough memory available), then with the FIFO scheduler my query will be placed in a queue, waiting for resources to be freed (1 container for the AM, 1 container of 4 GB for the map task and 1 container of 8 GB for the reduce task).

First of all, am I right?

2)

About the Fair Scheduler, I read: "With the Fair Scheduler, there is no need to reserve a set amount of capacity, since it will dynamically balance resources between all running jobs."

Does that mean I will not have to wait for 3 sufficiently large containers, and that I will instead get (for example) a 2 GB container for the map task and a 2 GB container for the reduce task IF the average container size is 2 GB? In that case, is increasing mapreduce.map|reduce.memory.mb pointless?

Is that so?

Thanks a lot for helping me see this more clearly :-)