I was testing Spark dynamic resource allocation on YARN. By default I see that "spark-thrift-sparkconf.conf" contains all the dynamic allocation properties. But when I run the job "spark-shell --master yarn --num-executors 5 --executor-memory 3G", I expected it to complain, since I've requested the number of executors in the job itself.
Then I modified the custom spark-defaults.conf and added the dynamic allocation properties:
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.initialExecutors 1
spark.dynamicAllocation.maxExecutors 5
spark.dynamicAllocation.minExecutors 1
And when I run the same job, I see the following message:
16/05/23 09:18:54 WARN SparkContext: Dynamic Allocation and num executors both set, thus dynamic allocation disabled.
It also prints the messages below when more resources are needed. My doubt is: is dynamic allocation enabled by default, and in which config file should we define the dynamic allocation properties?
16/05/23 09:39:47 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 4)
16/05/23 09:39:48 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 5)
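Those log lines are consistent with Spark's exponential ramp-up under backlog: each round it asks for roughly double the previous request, capped at spark.dynamicAllocation.maxExecutors. A toy sketch of that pattern (plain Python, not the real ExecutorAllocationManager; the function name is made up for illustration):

```python
def backlog_requests(start, max_executors):
    """Toy model: request sizes double each backlogged round,
    capped so the total never exceeds max_executors."""
    total, add, requests = start, 1, []
    while total < max_executors:
        add = min(add, max_executors - total)  # cap the final request
        requests.append(add)
        total += add
        add *= 2  # exponential ramp-up

    return requests

# Starting from initialExecutors=1 with maxExecutors=5:
print(backlog_requests(1, 5))  # [1, 2, 1]
```

The last two requests (2, then 1, reaching a desired total of 5) match the log messages above.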
As per the documentation for spark.executor.instances when running Spark on YARN:
"The number of executors. Note that this property is incompatible with spark.dynamicAllocation.enabled. If both spark.dynamicAllocation.enabled and spark.executor.instances are specified, dynamic allocation is turned off and the specified number of spark.executor.instances is used."
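The quoted precedence rule can be sketched as a toy check (plain Python, not Spark's actual implementation; the function name and conf-dict shape are made up for illustration — --num-executors simply sets spark.executor.instances):

```python
def dynamic_allocation_enabled(conf, cli_num_executors=None):
    """Toy model of the documented rule: an explicit executor count
    (spark.executor.instances or --num-executors) wins, and dynamic
    allocation is turned off."""
    if cli_num_executors is not None or "spark.executor.instances" in conf:
        return False
    return conf.get("spark.dynamicAllocation.enabled", "false") == "true"

conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "5",
}

print(dynamic_allocation_enabled(conf))                       # True
print(dynamic_allocation_enabled(conf, cli_num_executors=5))  # False
```

This mirrors the warning in the log above: passing --num-executors 5 disables dynamic allocation even though it is enabled in the config file.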
Yes @Jitendra Yadav, I can see the same in the logs: spark.executor.instances overrides the dynamic allocation properties. But my question is where we should define the dynamic allocation settings, in spark-defaults.conf or spark-thrift-sparkconf.conf?