In /spark2-history, a long-running application is generating event logs of 30GB+ in size.
How can we control the spark2-history log size for each application?
If you see many old orphan files in your "/spark2-history" log folder, then these are probably files left over from spark-driver failures, crashes, etc.
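One way to spot such leftovers is to look for event logs still carrying the ".inprogress" suffix, which Spark appends until an application finishes cleanly. A minimal check, assuming the default /spark2-history location:

# hdfs dfs -ls /spark2-history | grep inprogress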
You can check the following parameters in your Spark configuration:
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval
spark.history.fs.cleaner.maxAge
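As a sketch, assuming you want event logs older than 7 days removed and the cleaner to run once a day (these happen to be the Spark defaults; adjust them to your retention needs), you could add the following to spark-defaults.conf and then restart the Spark2 History Server:

spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval=1d
spark.history.fs.cleaner.maxAge=7d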
NOTE: However, there are known issues reported for older versions of Spark where the spark.history.fs.cleaner required some improvements.
As part of the https://issues.apache.org/jira/browse/SPARK-8617 fix, Spark 2.2 should function properly.
Also, please check whether the ownership of the files inside "/spark2-history" is set correctly. If not, set it according to your setup:
# hdfs dfs -chown -R spark:hadoop /spark2-history
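You can then verify the result with a listing; the owner and group columns should show spark:hadoop (the values used in the command above):

# hdfs dfs -ls /spark2-history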