spark history logs not deleted

Hi all,

We want to trigger the Spark history cleanup every hour (spark.history.fs.cleaner.interval = 1h)

and delete Spark history logs that are older than one day (spark.history.fs.cleaner.maxAge = 1d),

so we set the following:

[Screenshot of the settings: 76611-capture.png]

and restarted the Spark services.

But after 2 hours the logs are still not deleted, even though according to our settings they should be deleted after one hour.

Why are the logs not deleted after one hour?

Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

I think you need to understand what these parameters mean once spark.history.fs.cleaner.enabled has been set to true.

The values are time strings given in hours (h) or days (d).

spark.history.fs.cleaner.maxAge=1d

Job history files older than this will be deleted when the filesystem history cleaner runs.

spark.history.fs.cleaner.interval=1h

This dictates how often the filesystem job history cleaner checks for files to delete.

Files are only deleted if they are older than spark.history.fs.cleaner.maxAge, so in your case the maxAge is 1 day = 24h. Expect a file to be deleted only once it is more than 24 hours old, at the next hourly cleaner run.

To test or validate, you can change maxAge to 2h, and those files should be gone within about 3h.
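
To make the interaction between the two settings concrete, here is a minimal Python sketch. It is a simplified model of the behaviour described above, not Spark's actual implementation: the function names `parse_duration` and `first_deletion_time` are made up for illustration, only the `h` and `d` suffixes used in this thread are handled, and the cleaner is assumed to run at a fixed interval from a given start time.

```python
from datetime import datetime, timedelta


def parse_duration(value: str) -> timedelta:
    """Parse a time string such as '1h' or '1d' (only h and d are handled here)."""
    units = {"h": "hours", "d": "days"}
    number, unit = value[:-1], value[-1]
    if unit not in units:
        raise ValueError(f"unsupported unit in {value!r}")
    return timedelta(**{units[unit]: int(number)})


def first_deletion_time(last_modified: datetime, cleaner_start: datetime,
                        interval: timedelta, max_age: timedelta) -> datetime:
    """Return the first cleaner run at which the file is older than max_age."""
    run = cleaner_start
    # The cleaner wakes up every `interval`, but skips files that are
    # not yet older than `max_age`.
    while run - last_modified <= max_age:
        run += interval
    return run


if __name__ == "__main__":
    modified = datetime(2018, 6, 1, 10, 30)   # hypothetical event log timestamp
    start = datetime(2018, 6, 1, 11, 0)       # hypothetical first cleaner run
    for max_age in ("1d", "2h"):
        when = first_deletion_time(modified, start,
                                   parse_duration("1h"), parse_duration(max_age))
        print(f"maxAge={max_age}: eligible for deletion at {when}")
```

In this model, with interval=1h and maxAge=1d the file is first eligible a little over 24 hours after it was written, while with maxAge=2h it is eligible at the third hourly run.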

HTH

4 REPLIES

@Michael Bronson

Do you see the <app-id>.inprogress files or the actual <app-id> files after applying the mentioned configs? If yes, try filtering these files and check whether any file is older than 1 day. Also, is there anything in the Spark History Server logs?
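
One way to do that filtering is sketched below in Python, assuming you have already captured the text output of `hdfs dfs -ls` on your event log directory. The function name `stale_event_logs` and the `/spark-history` path in the sample are hypothetical; the parser only assumes the usual listing columns (permissions, replication, owner, group, size, date, time, path).

```python
from datetime import datetime, timedelta


def stale_event_logs(ls_lines, now, max_age=timedelta(days=1)):
    """Return (completed, inprogress) paths whose timestamp is older than max_age."""
    completed, inprogress = [], []
    for line in ls_lines:
        fields = line.split(None, 7)
        # Skip the "Found N items" header, directories, and malformed lines.
        if len(fields) < 8 or not fields[0].startswith("-"):
            continue
        modified = datetime.strptime(f"{fields[5]} {fields[6]}", "%Y-%m-%d %H:%M")
        path = fields[7]
        if now - modified > max_age:
            target = inprogress if path.endswith(".inprogress") else completed
            target.append(path)
    return completed, inprogress


if __name__ == "__main__":
    # Hypothetical listing, for illustration only.
    sample = [
        "Found 3 items",
        "-rwxrwx---   3 spark hadoop  1024 2018-05-30 09:00 /spark-history/application_1_0001",
        "-rwxrwx---   3 spark hadoop  2048 2018-06-01 08:00 /spark-history/application_1_0002",
        "-rwxrwx---   3 spark hadoop   512 2018-05-29 09:00 /spark-history/application_1_0003.inprogress",
    ]
    done, running = stale_event_logs(sample, now=datetime(2018, 6, 1, 12, 0))
    print("completed, older than 1 day:", done)
    print("in progress, older than 1 day:", running)
```

If both lists come back empty, no file has reached maxAge yet, so the cleaner has nothing to delete.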

Master Mentor

@Michael Bronson

Any updates? If you found this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.

Cloudera Employee

Hi,

 

We understand that the logs are not getting deleted even though you have enabled the spark.history.fs properties. Did you find any errors in the SHS logs regarding this?

 

Thanks

AKR