Apache Spark is not deleting the folders in the temporary directory (spark.local.dir)
Labels: Apache Spark
Created 08-03-2017 10:47 AM
Hi,
Background
We are running a local-mode Spark application that executes a Spark job every 5 minutes using a singleton SparkContext.
- We are using Spark 1.6.2.2.4.3.0-227
- The application is long running and "sleeps" in-between the jobs
- We are using SparkContext.getOrCreate
- We are running spark in "local[*]" mode
- "spark.worker.cleanup.enabled" is set to "true"
- The application is written in Scala
- On failure we invoke the SparkContext "stop" method in order to get a healthy SparkContext for the next job (see the sketch after this list).
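For reference, here is a condensed sketch of the setup described above (runBusinessLogic and the scheduling loop are illustrative placeholders, not our exact code):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ScheduledJob {
  // Placeholder for the real business logic.
  def runBusinessLogic(sc: SparkContext): Unit =
    sc.parallelize(1 to 100).sum()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("scheduled-job")
      .set("spark.worker.cleanup.enabled", "true")

    while (true) {
      // Returns the live singleton, or creates a fresh context if the
      // previous iteration stopped it after a failure.
      val sc = SparkContext.getOrCreate(conf)
      try runBusinessLogic(sc)
      catch {
        case e: Exception =>
          // Stop the broken context so the next getOrCreate builds a new one.
          sc.stop()
      }
      Thread.sleep(5 * 60 * 1000L) // "sleep" between jobs
    }
  }
}
```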
Problem
The "spark.local.dir" directory is filling up over time, and we eventually get "java.io.IOException: No space left on device".
-----------------
We found an old Jira ticket describing the issue (https://issues.apache.org/jira/browse/SPARK-7439), but it was closed with the motivation that "the dirs should already be cleaned up on JVM exit". That does not help in our case: the application is long-running, so the JVM never exits between jobs and the exit-time cleanup never runs.
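One possible workaround is to delete stale scratch directories ourselves between runs. Below is a rough sketch; cleanStaleDirs and the age-based heuristic are illustrative, not Spark API, and it assumes no other Spark process shares spark.local.dir:

```scala
import java.nio.file.{Files, Path, Paths}
import java.util.Comparator
import scala.collection.JavaConverters._

object LocalDirCleanup {
  // Delete blockmgr-* / spark-* directories under spark.local.dir that
  // have not been modified for maxAgeMs. The age threshold is a crude
  // guard against deleting directories the live context still uses.
  def cleanStaleDirs(localDir: String, maxAgeMs: Long): Unit = {
    val root = Paths.get(localDir)
    if (!Files.isDirectory(root)) return
    val cutoff = System.currentTimeMillis() - maxAgeMs
    val listing = Files.list(root)
    try {
      listing.iterator().asScala
        .filter { p =>
          val name = p.getFileName.toString
          name.startsWith("blockmgr-") || name.startsWith("spark-")
        }
        .filter(p => Files.getLastModifiedTime(p).toMillis < cutoff)
        .foreach(deleteRecursively)
    } finally listing.close()
  }

  private def deleteRecursively(dir: Path): Unit = {
    val walk = Files.walk(dir)
    try {
      // Reverse (depth-first) order so files go before their parent dirs.
      walk.sorted(Comparator.reverseOrder[Path]())
        .iterator().asScala
        .foreach(Files.deleteIfExists(_))
    } finally walk.close()
  }
}
```

Calling something like cleanStaleDirs(sc.getConf.get("spark.local.dir", "/tmp"), maxAgeMs = 60 * 60 * 1000L) after each run should keep the directory bounded, at the cost of racing against a context that is still alive, hence the age threshold.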
Created 08-10-2017 12:49 PM
Are you closing the SparkContext here? Usually, once a ".close()" call is made, the JVM should be able to clean up those directories.
Created 08-14-2017 02:42 PM
Hi Sumesh,
We are using a singleton SparkContext (SparkContext.getOrCreate). When the business logic fails, we call ".stop()" on it (".close()" is not available) to make sure a new one is created for the next run.
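To see the accumulation between runs, something like the following can count the leftover scratch directories after each job (countScratchDirs is just an illustrative helper, not Spark API):

```scala
import java.nio.file.{Files, Paths}
import scala.collection.JavaConverters._

object LocalDirMonitor {
  // Count leftover Spark scratch directories under spark.local.dir.
  def countScratchDirs(localDir: String): Int = {
    val listing = Files.list(Paths.get(localDir))
    try {
      listing.iterator().asScala.count { p =>
        val name = p.getFileName.toString
        name.startsWith("blockmgr-") || name.startsWith("spark-")
      }
    } finally listing.close()
  }
}
```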
