Created 07-07-2016 12:23 PM
We're using the capacity scheduler on YARN with several queues. One of the queues is reserved for Spark notebooks (Jupyter/Zeppelin). Many of our users leave their notebooks open for days on end, and most of the time they are not actually using the CPU and memory they have claimed.
What would be a good configuration for this use case? Is it possible to configure YARN/Spark in such a way that inactive notebooks do not hinder other users?
Created 07-07-2016 04:05 PM
@R Pul Yes, that is a common problem. The first thing I would try is at the Spark configuration level: enable Dynamic Resource Allocation. Here is a description (from the link below):
"Spark 1.2 introduces the ability to dynamically scale the set of cluster resources allocated to your application up and down based on the workload. This means that your application may give resources back to the cluster if they are no longer used and request them again later when there is demand. This feature is particularly useful if multiple applications share resources in your Spark cluster. If a subset of the resources allocated to an application becomes idle, it can be returned to the cluster’s pool of resources and acquired by other applications. In Spark, dynamic resource allocation is performed on the granularity of the executor and can be enabled through spark.dynamicAllocation.enabled
."
And in particular, the Remove Policy:
The policy for removing executors is much simpler. A Spark application removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.
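For the idle-notebook case this timeout is the main knob. A sketch, with an illustrative value (in the 1.2 docs it is a plain number of seconds):

spark.dynamicAllocation.executorIdleTimeout 60

Note that later Spark releases also added spark.dynamicAllocation.cachedExecutorIdleTimeout for executors holding cached data, which by default are never released; check the docs for the version you run.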
Web page:
https://spark.apache.org/docs/1.2.0/job-scheduling.html
Also, check out the paragraph entitled "Graceful Decommission of Executors" for more information.