Support Questions


Not able to make Yarn dynamically allocate resources for Spark


Hi everyone,

I have a cluster managed by Yarn that runs Spark jobs; the components were installed using Ambari. I have 6 hosts, each with 6 cores, and I use the Fair scheduler.

I want Yarn to dynamically add and remove executor cores, but no matter what I do it doesn't work.

Relevant Spark configuration (configured in Ambari):

spark.dynamicAllocation.schedulerBacklogTimeout 10s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
spark.driver.memory 4G
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 6 (has no effect - starts with 2)
spark.dynamicAllocation.maxExecutors 10
spark.dynamicAllocation.minExecutors 1
spark.scheduler.mode FAIR
spark.shuffle.service.enabled true
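For reference, the same properties can also be passed per job rather than cluster-wide. A minimal spark-submit sketch (the application file `my_job.py` is a placeholder, not taken from the question):

```shell
# Sketch: per-job dynamic allocation settings for a YARN-managed cluster.
# my_job.py is a placeholder for the actual application.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.initialExecutors=6 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  my_job.py
```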

Relevant Yarn configuration (configured in Ambari):

yarn.nodemanager.aux-services mapreduce_shuffle,spark_shuffle,spark2_shuffle
YARN Java heap size 4096
yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
yarn.scheduler.fair.preemption true
yarn.nodemanager.aux-services.spark2_shuffle.classpath {{stack_root}}/${hdp.version}/spark2/aux/*
yarn.nodemanager.aux-services.spark_shuffle.classpath {{stack_root}}/${hdp.version}/spark/aux/*
Minimum Container Size (VCores) 0
Maximum Container Size (VCores) 12
Number of virtual cores 12
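For each aux-service listed above, the NodeManagers also need the service class registered. A yarn-site.xml sketch (a minimal illustration, assuming the Spark 2 shuffle jar is on the classpath configured above; Ambari normally manages these properties itself):

```xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark2_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark2_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```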

I also followed the manual and completed all the steps to configure the external shuffle service, including copying the yarn-shuffle jar:

cp /usr/hdp/ /usr/hdp/

I see that only 3 cores are allocated to the application (the default is 2 executors, so I guess it's 2 + the driver); a screenshot from the queue is attached, although many tasks are pending (screenshot added).

I want to get to a point where Yarn starts every application with 3 CPUs, but allocates more resources when there are pending tasks.

If it is relevant, I use Jupyter Notebook and findspark to connect to the cluster:

import findspark
findspark.init()
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("internal-external2").getOrCreate()
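If the notebook client is falling back to defaults, one option is to pass the dynamic-allocation properties explicitly when building the session instead of relying on cluster-side configuration. A sketch (the property names are standard Spark settings and the values mirror the Ambari configuration above; the pyspark lines are commented out because they need a live cluster):

```python
# Sketch: dynamic-allocation properties passed explicitly at session
# creation, so the Jupyter client cannot silently fall back to defaults.
# Values mirror the Ambari configuration quoted in the question.
dyn_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.initialExecutors": "6",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "10",
}

# In the notebook (requires findspark and pyspark on the client):
# import findspark; findspark.init()
# from pyspark.sql import SparkSession
# builder = SparkSession.builder.appName("internal-external2").master("yarn")
# for key, value in dyn_conf.items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()

for key, value in sorted(dyn_conf.items()):
    print(f"{key}={value}")
```

Note that spark.executor.instances is deliberately absent from the dict, since setting it overrides dynamic allocation.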

I would really appreciate any suggestion or help; there is no manual on this topic that I haven't tried.
Thanks a lot,




@Anton P

Please double-check that the number of executor instances has not been set in the Spark conf, on the command line, or in code. If it is set, it disables the dynamic allocation feature. In the Spark UI, under the Environment tab, check whether executor instances is set to a number.
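The rule can be sketched as a small check (the helper function below is illustrative, not a Spark API):

```python
def dynamic_allocation_effective(conf):
    """Illustrative helper (not part of Spark): mirrors the rule that an
    explicit spark.executor.instances overrides dynamic allocation, and
    that dynamic allocation also needs the external shuffle service."""
    if conf.get("spark.executor.instances") is not None:
        return False  # a fixed executor count wins; dynamic allocation is off
    return (conf.get("spark.dynamicAllocation.enabled") == "true"
            and conf.get("spark.shuffle.service.enabled") == "true")

# Example: dynamic allocation enabled, but executor instances also set
print(dynamic_allocation_effective({
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.executor.instances": "2",
}))  # False
```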




Hi @Felix Albani, and thanks for the answer,

I have double-checked, and no executor-instances configuration is set. Moreover, no memory, core, or similar configuration is set for the executor or driver.

I do want to repeat that I run the computations from a Jupyter Notebook, and it feels like the allocation is using the default configuration.

I have also tried running a program with spark-submit, and the results are the same.