Created 11-30-2017 10:08 AM
We want to set the parameter fs.trash.interval to zero:
fs.trash.interval=0
At the moment, when we delete files they go to the trash, for example:
hdfs dfs -rm -R /spark2-history/*
With fs.trash.interval set to zero, files are supposed to be deleted without going to the trash.
We want this because HDFS usage in our lab has become high on the Ambari dashboard. When we delete files, they actually stay in the trash, so HDFS usage remains high.
So I guess that if we set fs.trash.interval to zero, files will no longer go to the trash. Is that correct?
Second question: by default fs.trash.interval=360 minutes.
Does this really mean that files are deleted from the trash every 360 minutes (the interval), or does it depend on something else?
Created 11-30-2017 10:17 AM
The correct definition of the "fs.trash.interval" property is as follows: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/core-default.xml
fs.trash.interval => Number of minutes after which the checkpoint gets deleted.
If zero, the trash feature is disabled. This option may be configured both on the server and the client.
If trash is disabled server side then the client side configuration is checked.
If trash is enabled on the server side then the value configured on the server is used and the client configuration value is ignored.
You can also use the -skipTrash option, which bypasses the trash (if enabled) and deletes the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.
hdfs dfs -rm -R -skipTrash /spark2-history/*
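If HDFS usage stays high because earlier deletions are still sitting in the trash, a few small helpers along these lines may help to inspect and clear it. This is only a sketch: the function names are made up for illustration, it assumes the hdfs CLI is on the PATH, and it assumes trash lives in the default per-user location /user/<username>/.Trash. Note that hdfs getconf reads the configuration files on the machine where it runs, which is not necessarily the value the NameNode is using.

```shell
# Sketch only: helper names are hypothetical; assumes the default
# per-user trash location /user/<username>/.Trash.

# Print fs.trash.interval (in minutes) as seen by the local configuration.
trash_interval() {
    hdfs getconf -confKey fs.trash.interval
}

# Print how much space the given user's trash currently occupies.
trash_usage() {
    hdfs dfs -du -s -h "/user/$1/.Trash"
}

# Permanently delete everything in the given user's trash.
# The glob is quoted so that HDFS expands it, not the local shell.
purge_trash() {
    hdfs dfs -rm -R -skipTrash "/user/$1/.Trash/*"
}
```

For example, trash_usage admin shows the size of the admin user's trash, and purge_trash admin removes its contents for good, so double-check the user name before running it.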
Created 11-30-2017 10:31 AM
@Jay so can we summarize it like this: when, for example, we set fs.trash.interval=60 minutes, then at the next interval all files in the trash will be deleted?
Created 11-30-2017 10:39 AM
Yes, setting the parameter to 60 minutes will cause deleted content to be cleared from the trash once it is at least 60 minutes old.
Example: if we delete the file "/home/admin/test.txt" at 1:00 PM, then with a 60-minute trash interval that file becomes eligible for removal from the .Trash directory at 2:00 PM. The actual removal happens the next time the trash checkpointer runs, which is controlled by fs.trash.checkpoint.interval.
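The arithmetic in that example can be sketched as a small shell function. This is an illustration only: the function name is hypothetical, it assumes GNU date (for the -d option), and it treats the times as UTC to keep the calculation simple. It gives the earliest time a file can leave the trash; the real removal may come somewhat later, depending on when the NameNode next cleans up the trash.

```shell
# Sketch only: hypothetical helper, requires GNU date.
# Usage: earliest_purge "<deletion time>" <fs.trash.interval in minutes>
earliest_purge() {
    # Convert the deletion time to seconds since the epoch.
    deleted_epoch=$(TZ=UTC date -d "$1" +%s) || return 1
    # Add the trash interval (minutes converted to seconds) and print the result.
    TZ=UTC date -d "@$((deleted_epoch + $2 * 60))" '+%Y-%m-%d %H:%M'
}
```

Calling earliest_purge "2017-11-30 13:00" 60 prints 2017-11-30 14:00, which matches the 1:00 PM to 2:00 PM example.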
But if you want immediate deletion, the -skipTrash option is the best choice, because it bypasses the trash entirely.