Created 07-29-2017 05:28 PM
Hi,
Please correct me if I am wrong. As I understand it, Spark should run a number of containers equal to the number of executors. When we run a job with 600 executors, Spark tries to run roughly 700 containers, causing the job to fail because my queue only allows up to 620 containers. As per my understanding, all tasks are executed inside Spark executor containers, so how did these extra containers get launched? Please help me with this issue.
Created 07-31-2017 04:49 AM
Hi @mravipati,
can you please check whether Dynamic Resource Allocation is enabled:
spark.dynamicAllocation.enabled=true
With this enabled, Spark will request as many executors as it can, depending on resource availability in the cluster, and this may be causing the problem.
On another note, this behaviour can be controlled by setting a maximum:
spark.dynamicAllocation.maxExecutors=<max limit>
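If you do want dynamic allocation, a common approach is to bound it explicitly in spark-defaults.conf. A minimal sketch, assuming illustrative values (not recommendations) and that the external shuffle service is available on your NodeManagers:

```properties
# Enable dynamic allocation but cap how far it can scale
spark.dynamicAllocation.enabled          true
spark.dynamicAllocation.minExecutors     10
spark.dynamicAllocation.maxExecutors     600
# Dynamic allocation requires the external shuffle service
spark.shuffle.service.enabled            true
```

Setting maxExecutors below your queue's container limit (minus one for the ApplicationMaster) keeps Spark's requests within what the queue can actually grant.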
Please also note that the driver is allocated a container as well, so you need to manage the memory allocations for both the executors and the driver.
For instance, if your YARN minimum container size is 2GB and each executor requests about 2GB, YARN will actually allocate 4GB per executor, because spark.yarn.executor.memoryOverhead also has to be accounted for and the total request is rounded up to a multiple of the minimum container size.
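The rounding described above can be sketched as a quick back-of-the-envelope calculation. This mirrors YARN's behaviour of rounding each request up to a multiple of its minimum allocation; the 10%/384MB overhead rule and the example values are assumptions based on this thread:

```python
import math

def yarn_container_mb(executor_mb: int, min_alloc_mb: int,
                      overhead_factor: float = 0.10,
                      overhead_floor_mb: int = 384) -> int:
    """Approximate the container size YARN grants for a Spark executor.

    Overhead is max(384 MB, 10% of executor memory), and YARN rounds the
    total (executor memory + overhead) up to a multiple of its minimum
    allocation (yarn.scheduler.minimum-allocation-mb).
    """
    overhead = max(overhead_floor_mb, int(executor_mb * overhead_factor))
    requested = executor_mb + overhead
    return math.ceil(requested / min_alloc_mb) * min_alloc_mb

# 2 GB executor on a cluster with a 2 GB minimum allocation:
print(yarn_container_mb(2048, 2048))  # -> 4096, i.e. the 4 GB described above
```

The same arithmetic explains the 12GB containers mentioned later in the thread: a 10GB executor plus overhead, rounded up against a 4GB minimum allocation, lands on 12GB.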
The following KB article explains in more detail why Spark takes more resources than requested.
Created 07-31-2017 07:06 AM
Hi,
Dynamic allocation is enabled and spark.dynamicAllocation.maxExecutors is set to 10.
My YARN minimum container size is 4GB, the executor size is 10GB, and the overhead memory is 384MB, so the containers are launching at 12GB each. We are OK with the 12GB container size and we are clear on the memory utilization. But we have failed to understand how Spark launches more containers than the number of executors. I think the driver also uses the AM container rather than launching a separate container for the driver process.
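On the driver point: this depends on the deploy mode. In yarn-cluster mode the driver does run inside the ApplicationMaster container; in yarn-client mode the driver runs on the submitting machine, but YARN still launches a small, separate AM container on top of the executors. A minimal config sketch, assuming cluster mode is what you want:

```properties
# Run the driver inside the YARN ApplicationMaster container.
# In client mode the driver runs on the submitting machine instead,
# but a small AM container is still allocated from the queue.
spark.submit.deployMode    cluster
```

Either way, the AM container counts against the queue's container limit in addition to the executor containers, which is one source of "extra" containers beyond the executor count.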