@R Pul Yes, that is a common problem. The first thing I would try, at the Spark configuration level, is enabling Dynamic Resource Allocation. Here is a description (from the link below):
"Spark 1.2 introduces the ability to dynamically scale the set of cluster resources allocated to your application up and down based on the workload. This means that your application may give resources back to the cluster if they are no longer used and request them again later when there is demand. This feature is particularly useful if multiple applications share resources in your Spark cluster. If a subset of the resources allocated to an application becomes idle, it can be returned to the cluster’s pool of resources and acquired by other applications. In Spark, dynamic resource allocation is performed on the granularity of the executor and can be enabled through spark.dynamicAllocation.enabled ." And in particular, the Remove Policy: The policy for removing executors is much simpler. A Spark application removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds. Web page:
Also, check out the section entitled "Graceful Decommission of Executors" on that page for more information.