
Set fs.trash.interval to zero so deleted files do not go to the trash


We want to set the parameter fs.trash.interval to zero, as in:

fs.trash.interval=0

because when we delete files, they go to the trash.

example:

hdfs dfs -rm -R /spark2-history/*

When we disable fs.trash.interval, files are supposed to be deleted without going to the trash.

We want to set fs.trash.interval to zero because, in our lab, HDFS usage became high on the Ambari dashboard.

When we delete the files, they actually stay in the trash and HDFS usage remains high.

So I guess that if we set fs.trash.interval to zero, files will not go to the trash.

Is that correct?

Second:

By default, fs.trash.interval=360 minutes.

Does this really mean that every 360 minutes (one interval), files will be deleted from the trash? Or does it depend on something else?

Michael-Bronson
1 ACCEPTED SOLUTION

avatar
Master Mentor

@Michael Bronson

The correct definition of the "fs.trash.interval" property is as follows: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/core-default.xml

fs.trash.interval => Number of minutes after which the checkpoint gets deleted.

If zero, the trash feature is disabled. This option may be configured both on the server and the client.

If trash is disabled server side then the client side configuration is checked.

If trash is enabled on the server side then the value configured on the server is used and the client configuration value is ignored.
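To apply this, the property is typically set in core-site.xml (in an Ambari-managed cluster, via the HDFS configuration section rather than by editing the file directly). A sketch of the relevant fragment, assuming you want trash disabled:

```xml
<!-- core-site.xml: a value of 0 disables the trash feature entirely -->
<property>
  <name>fs.trash.interval</name>
  <value>0</value>
</property>
```

Note that disabling trash only affects future deletes; anything already sitting under a user's .Trash directory still occupies HDFS space until it is removed.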


You can also use the -skipTrash option, which bypasses trash (if enabled) and deletes the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.

hdfs dfs -rm -R -skipTrash /spark2-history/*



3 REPLIES



@Jay So can we summarize that when, for example, we set fs.trash.interval=60 minutes, then at the next interval all files in the trash will be deleted?

Michael-Bronson

Master Mentor

@Michael Bronson

Yes, setting the parameter to 60 minutes will cause the trash to be cleared of the deleted content after 60 minutes.

Example: if we delete a file named "/home/admin/test.txt" at 1:00 PM, then with a 60-minute trash interval that file will be cleared from the .Trash directory at 2:00 PM.
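As a rough illustration of that arithmetic, here is a simplified model (an assumption for clarity, not the actual HDFS Emptier code: the real Emptier works on checkpoint boundaries, so files can in practice survive slightly longer than one interval):

```python
from datetime import datetime, timedelta

def trash_clear_time(deleted_at, trash_interval_min):
    """Simplified model: the trash checkpoint holding a deleted file
    becomes eligible for removal trash_interval_min minutes after deletion."""
    return deleted_at + timedelta(minutes=trash_interval_min)

# File deleted at 1:00 PM with fs.trash.interval=60
deleted = datetime(2018, 1, 1, 13, 0)
print(trash_clear_time(deleted, 60))  # 2018-01-01 14:00:00
```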


But if you want immediate deletion, then the -skipTrash option is best, as it bypasses trash entirely.