Created on 06-10-2018 10:13 AM - edited 08-17-2019 07:40 PM
Hi all,
We want to trigger the Spark history cleanup every hour (spark.history.fs.cleaner.interval = 1h)
and delete the Spark history logs that are older than one day (spark.history.fs.cleaner.maxAge = 1d),
so we set these two properties
and restarted the Spark services.
But after 2 hours the logs are still not deleted, even though according to our settings they should be deleted after one hour.
Why were the logs not deleted after one hour?
Created 06-10-2018 07:46 PM
I think you need to understand what these parameters mean once spark.history.fs.cleaner.enabled has been set to true.
The values are given in hours (h) or days (d).
spark.history.fs.cleaner.maxAge=1d
Job history files older than this will be deleted when the filesystem history cleaner runs.
spark.history.fs.cleaner.interval=1h
This dictates how often the filesystem job history cleaner checks for files to delete.
Files are only deleted if they are older than spark.history.fs.cleaner.maxAge. In your case maxAge is 1 day = 24h, so expect a file to be deleted only once it is more than 24 hours old, at the next hourly cleaner run.
To test or validate, you can change maxAge to 2h; those files should then be gone within about 3h.
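The interplay between the two settings can be sketched with a small calculation. This is only an illustration, not Spark's own code: the function name and the timestamps are made up, and it assumes the cleaner runs once at startup and then once per interval.

```python
from datetime import datetime, timedelta

def first_deletion_time(last_modified, cleaner_start, interval, max_age):
    """Return the first cleaner run at which a log is older than max_age.

    Hypothetical helper: assumes the cleaner runs at cleaner_start and
    then every `interval` after that.
    """
    run = cleaner_start
    # A run deletes the log only if its age is strictly greater than max_age.
    while run - last_modified <= max_age:
        run += interval
    return run

# Made-up example: log last written at 08:30, services restarted at 10:00.
last_modified = datetime(2018, 6, 10, 8, 30)
cleaner_start = datetime(2018, 6, 10, 10, 0)
hourly = timedelta(hours=1)

with_one_day = first_deletion_time(last_modified, cleaner_start, hourly, timedelta(days=1))
with_two_hours = first_deletion_time(last_modified, cleaner_start, hourly, timedelta(hours=2))

print("maxAge=1d ->", with_one_day)    # the next day, not one hour later
print("maxAge=2h ->", with_two_hours)  # the same morning
```

With maxAge=1d the hourly runs find nothing to delete until the log itself is more than 24 hours old, which is why nothing disappears after only 2 hours.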
HTH
Created 06-10-2018 07:35 PM
Do you see the <app-id>.inprogress files or the actual <app-id> files after applying the mentioned configs? If yes, try filtering these files and check whether any of them is older than 1 day. Also, is there anything in the Spark History Server logs?
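One way to do that filtering is to save the output of `hdfs dfs -ls` on the history directory and pick out the entries older than one day. The sketch below assumes the usual eight-column listing format; the sample lines and the /spark2-history path are made up for illustration, so substitute your own event log directory.

```python
from datetime import datetime, timedelta

def logs_older_than(ls_lines, now, max_age=timedelta(days=1)):
    """Return (path, modified) pairs from `hdfs dfs -ls` lines older than max_age."""
    old = []
    for line in ls_lines:
        # Columns: permissions, replication, owner, group, size, date, time, path
        parts = line.split(None, 7)
        # Skip the "Found N items" header, directories, and malformed lines.
        if len(parts) != 8 or not parts[0].startswith("-"):
            continue
        modified = datetime.strptime(parts[5] + " " + parts[6], "%Y-%m-%d %H:%M")
        if now - modified > max_age:
            old.append((parts[7], modified))
    return old

# Hypothetical listing; real output comes from `hdfs dfs -ls <history dir>`.
sample = [
    "Found 2 items",
    "-rwxrwx---   3 spark hadoop      52341 2018-06-08 09:15 /spark2-history/application_1_0001",
    "-rwxrwx---   3 spark hadoop      10422 2018-06-10 09:40 /spark2-history/application_1_0002.inprogress",
]
now = datetime(2018, 6, 10, 10, 13)

for path, modified in logs_older_than(sample, now):
    print(path, "last modified", modified)
```

If this prints nothing, no file has reached maxAge yet and the cleaner has nothing to delete.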
Created 06-10-2018 10:59 PM
Any updates? If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
Created 01-07-2020 01:41 AM
Hi,
We understand that the logs are not being deleted even though you enabled the spark.history.fs.cleaner properties. Did you find any errors in the SHS logs regarding this?
Thanks
AKR