Created 06-12-2019 06:09 AM
In spark2-history, a long-running application is generating logs of 30GB+ in size.
How can we control the spark2-history size for each application?
Created 06-20-2019 07:23 AM
If your "/spark2-history" log folder contains many old orphan files, they are probably files left over from Spark driver failures, crashes, etc.
You can check the following parameters in your Spark configuration:
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval
spark.history.fs.cleaner.maxAge
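As a minimal sketch of how these might be set in spark-defaults.conf (the interval and maxAge values below are illustrative assumptions; tune them to your own retention needs):

# spark-defaults.conf -- illustrative retention settings for the history server cleaner
spark.history.fs.cleaner.enabled=true
spark.history.fs.cleaner.interval=1d
spark.history.fs.cleaner.maxAge=7d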
NOTE: However, there are some known issues reported for older versions of Spark where spark.history.fs.cleaner required improvements.
With the fix for https://issues.apache.org/jira/browse/SPARK-8617, Spark 2.2 should function properly.
Also, please check whether the ownership of the files inside "/spark2-history" is set correctly. If not, please set it correctly according to your setup:
# hdfs dfs -chown spark:hadoop /spark2-history
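If the ownership problem applies to the event log files themselves rather than just the directory, a recursive chown (plus a quick listing to verify) may be needed; a sketch, assuming the spark:hadoop owner/group from the command above:

# verify current ownership of the directory and its contents
hdfs dfs -ls /spark2-history | head
# recursively reset ownership if the individual event log files are wrong
hdfs dfs -chown -R spark:hadoop /spark2-history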
Created 07-07-2019 10:32 PM
Did it help? Please update the thread. If you have a follow-up query, please post it here; if not, please mark this thread as answered by clicking the "Accept" button.
Created 06-20-2019 01:21 PM
Why don't you set SPARK_DAEMON_MEMORY=2g in spark-env.sh?
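For reference, a minimal sketch of how that could look in spark-env.sh on the Spark History Server host (the 2g value comes from the suggestion above; size it to your own event log volume):

# spark-env.sh -- memory allocated to Spark daemons such as the history server
export SPARK_DAEMON_MEMORY=2g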
Created 07-08-2019 02:26 AM
The above question and the entire response thread below were originally posted in the Community Help track. On Mon Jul 8 02:24 UTC 2019, a member of the HCC moderation staff moved it to the Data Science & Advanced Analytics track. The Community Help track is intended for questions about using the HCC site itself, not technical questions about administering Spark2.