Support Questions

johannes_kjellg · 08-03-2017

Hi,

Background

We are running a local-mode Spark application that runs a Spark job every 5 minutes using a singleton SparkContext.

- We are using Spark 1.6.2.2.4.3.0-227

- The application is long running and "sleeps" in-between the jobs

- We are using SparkContext.getOrCreate

- We are running Spark in "local[*]" mode

- "spark.worker.cleanup.enabled" is set to "true"

- The application is written in Scala

- On failure we invoke the SparkContext "stop" method so that the next job gets a fresh, healthy SparkContext.
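
For readers who want to reproduce the setup, the points above can be sketched roughly as follows. This is a minimal sketch, not the actual application: the object name `PeriodicJob`, the app name, the `/tmp/spark-local` path and the placeholder job are all assumptions made for illustration. Only `SparkContext.getOrCreate`, `stop()`, the `local[*]` master and the 5-minute interval come from the description.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.control.NonFatal

object PeriodicJob {
  // Hypothetical settings: the post does not give the app name or the local dir path.
  val conf: SparkConf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("periodic-job")
    .set("spark.local.dir", "/tmp/spark-local")

  // Runs one job. Returns true on success, false if the job failed
  // and the context was stopped.
  def runOnce(job: SparkContext => Unit): Boolean = {
    // Reuses the active context if there is one, otherwise creates a new one.
    val sc = SparkContext.getOrCreate(conf)
    try {
      job(sc)
      true
    } catch {
      case NonFatal(e) =>
        System.err.println(s"Job failed: ${e.getMessage}")
        sc.stop() // the next getOrCreate call builds a fresh context
        false
    }
  }

  def main(args: Array[String]): Unit = {
    while (true) {
      // Placeholder job standing in for the real business logic.
      runOnce(sc => println(sc.parallelize(1 to 100).count()))
      Thread.sleep(5 * 60 * 1000L) // "sleep" between jobs
    }
  }
}
```

Note that the JVM stays alive across all iterations, so any cleanup that only happens on JVM exit never runs in this pattern.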

Problem

The "spark.local.dir" directory is filling up over time, and we eventually get "java.io.IOException: No space left on device".
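
To see which directories are accumulating, something like the following could be pointed at the "spark.local.dir" location. It is a rough diagnostic sketch using only the JDK. The object name `LocalDirReport` is made up for illustration, and the "spark-" and "blockmgr-" name prefixes are an assumption about how Spark names its scratch directories, so check them against what is actually on disk.

```scala
import java.io.File

object LocalDirReport {
  // Recursively sums file sizes. listFiles() returns null for
  // non-directories or on I/O errors, hence the Option wrapper.
  def sizeOf(f: File): Long =
    if (f.isFile) f.length()
    else Option(f.listFiles()).map(_.map(sizeOf).sum).getOrElse(0L)

  // Assumption: scratch directories are named "spark-..." or "blockmgr-...".
  // Returns (directory name, size in bytes), largest first.
  def scratchDirs(localDir: File): Seq[(String, Long)] =
    Option(localDir.listFiles()).map(_.toSeq).getOrElse(Seq.empty)
      .filter { d =>
        d.isDirectory &&
          (d.getName.startsWith("spark-") || d.getName.startsWith("blockmgr-"))
      }
      .map(d => d.getName -> sizeOf(d))
      .sortBy(-_._2)

  def main(args: Array[String]): Unit = {
    // Pass the spark.local.dir path as the first argument; "/tmp" is only a fallback.
    val localDir = new File(args.headOption.getOrElse("/tmp"))
    scratchDirs(localDir).foreach { case (name, bytes) =>
      println(s"$bytes\t$name")
    }
  }
}
```

Running this before and after a failed job (and the resulting "stop" call) should show whether each new SparkContext leaves an extra directory behind.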

-----------------

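We found an old Jira ticket describing the issue (https://issues.apache.org/jira/browse/SPARK-7439), but it appears to have been closed on the grounds that "the dirs should already be cleaned up on JVM exit". Since our application is long running, the JVM does not exit between jobs, so that does not help us.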

skurup · 08-10-2017

Are you closing the Spark context here? Usually, once a ".close()" call is made, the JVM should be able to clean up those directories.

johannes_kjellg · 08-14-2017

Hi Sumesh,
We are using a singleton Spark context (SparkContext.getOrCreate). When the business logic fails, we call ".stop()" on it (".close()" is not available) to make sure a new one is created for the next run.

Cloudera Community


Apache Spark is not deleting the folders in the temporary directory (spark.local.dir)