
Each Spark job execution reduces the free space on hard disk of a cluster


I have a problem with free disk space in my cluster. There are 2 slave nodes and 1 master node, each with 30 GB of disk space. I assume that is quite enough for the processes I am running.

After running the Spark job around 10-15 times, I notice that the free space on one of the slave nodes decreases dramatically, and I start receiving red alerts in the Ambari UI. The Spark job does not save any data to HDFS; it only does some intensive data processing. I also call `df.cache` a couple of times in the code, but later I call `unpersist(false)`. This is how I run my Spark job:


  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 6g \
    --executor-cores 2 \
    --num-executors 2 \
    --executor-memory 4g \
    --class org.test.GraphProcessor

When I manually inspect the node from a terminal, I see a lot of leftover files in `spark2-history`, `.sparkStaging` and `/hadoop/yarn/local/usercache/hdfs` (as described here). I have to manually delete the contents of these folders to make the cluster operational again. What is wrong with my Ambari cluster settings? Shouldn't these files be cleaned up automatically after each Spark job execution?
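For reference, these are the settings that, as far as I understand from the Spark and YARN documentation, are supposed to control this cleanup. The values shown are examples, not necessarily what my cluster currently uses:

```properties
# spark-defaults.conf (Spark2 configs in Ambari)
# History server: periodically delete old event logs from spark2-history
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval=1d
spark.history.fs.cleaner.maxAge=7d

# .sparkStaging should be removed automatically when an application
# finishes, unless this is set to true (the default is false)
spark.yarn.preserve.staging.files=false

# yarn-site.xml (YARN configs in Ambari)
# NodeManager local cache: target size in MB and cleanup check interval
yarn.nodemanager.localizer.cache.target-size-mb=2048
yarn.nodemanager.localizer.cache.cleanup.interval-ms=600000
```

If `.sparkStaging` directories survive a finished application with these defaults, my understanding is that the applications either failed or were killed before cleanup could run.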

[hdfs@eureambarislave2 centos]$ hdfs dfs -du -h /user/hdfs/.sparkStaging
237.0 M  /user/hdfs/.sparkStaging/application_1525529485402_0157
237.0 M  /user/hdfs/.sparkStaging/application_1525529485402_0181
[hdfs@eureambarislave2 centos]$ hdfs dfs -rm -R -skipTrash /user/hdfs/.sparkStaging/*
Deleted /user/hdfs/.sparkStaging/application_1525529485402_0157
Deleted /user/hdfs/.sparkStaging/application_1525529485402_0181
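In the meantime I clean up by hand. This is a sketch of what I run (the 7-day retention and the paths are my own choices, not cluster defaults):

```shell
#!/bin/sh
# Delete immediate subdirectories of $1 that are older than $2 days.
prune_old_dirs() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$2" -exec rm -rf {} +
}

# The HDFS-side leftovers still need the command from above:
#   hdfs dfs -rm -R -skipTrash /user/hdfs/.sparkStaging/*
# For the local YARN cache I would call, e.g.:
#   prune_old_dirs /hadoop/yarn/local/usercache/hdfs/filecache 7
```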