Created 04-02-2018 12:11 PM
Issue with Trash.
I have configured fs.trash.interval to 360 minutes.
However, the trash is not being cleaned up according to this 6-hour setting.
I deleted one file on 22 February, another on 23 March, and another on 2 April.
With this configuration the trash should contain only the 2 April file, but in my case the trash path still contains all the old files.
Kindly let me know how to configure the trash correctly in WASB storage.
Thanks,
Praveen
Created 04-03-2018 08:09 AM
Here is some information I found in a forum:
1) "fs.trash.interval" is not respected on blob storage.
2) "hadoop fs -expunge" creates a checkpoint instead of emptying the trash.
The Apache documentation on 'fs -expunge' is a little confusing, or inaccurate, in simply stating that it will "Empty the Trash".
The command actually does two things:
a) It deletes all checkpoints that are older than the 'fs.trash.interval' config value.
b) It creates a new checkpoint of the current Trash directory.
As for Azure Blob storage not respecting 'fs.trash.interval', there is a bug being tracked for this issue, and there are some technical difficulties in solving it. In HDFS, the NameNode enforces the interval and cleans up the Trash accordingly. In Azure Blob storage, there is no component equivalent to the HDFS NameNode that can enforce the rule.
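Given that, one workaround you could try is running the cleanup by hand or from a scheduled job. This is only a rough sketch: it assumes the trash sits at the usual /user/<username>/.Trash location on the default filesystem, the function names trash_dir and expunge_trash are made up for illustration, and I have not verified how -expunge behaves on WASB.

```shell
#!/bin/sh
# Sketch only: inspect the current user's trash and trigger checkpoint cleanup.
# Assumption: trash lives under /user/<name>/.Trash on the default filesystem.

trash_dir() {
    # Print the conventional trash path for the given user name.
    printf '/user/%s/.Trash\n' "$1"
}

expunge_trash() {
    # Show what is currently in the trash, then ask Hadoop to delete
    # checkpoints older than fs.trash.interval and checkpoint the rest.
    hadoop fs -ls "$(trash_dir "$1")"
    hadoop fs -expunge
}

# Only call Hadoop when the client is actually installed.
if command -v hadoop >/dev/null 2>&1; then
    expunge_trash "$(id -un)"
fi
```

Because -expunge only removes checkpoints older than the interval, a file deleted just now would need one run to be checkpointed and a later run, after the interval has passed, to be removed.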
Could you try setting it to 5 minutes and test, just out of curiosity?
Created 04-02-2018 02:33 PM
If you changed the configuration through Ambari, did you restart the stale services?
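If you are not sure which value is in effect, something like the following could help. It is a sketch that assumes the hdfs client is on the PATH, and minutes_to_hours is just an illustrative helper. Note that it prints what the local client configuration resolves to, which may differ from what a running service loaded before its restart.

```shell
#!/bin/sh
# Sketch: print the fs.trash.interval value the local client config resolves to.

minutes_to_hours() {
    # Integer division, so 360 minutes gives 6 hours.
    echo $(( $1 / 60 ))
}

if command -v hdfs >/dev/null 2>&1; then
    interval=$(hdfs getconf -confKey fs.trash.interval 2>/dev/null)
    case "$interval" in
        ''|*[!0-9]*)
            # Empty or non-numeric output: report it rather than do arithmetic.
            echo "could not read fs.trash.interval (got: '$interval')" ;;
        *)
            echo "fs.trash.interval = $interval min (about $(minutes_to_hours "$interval") h)" ;;
    esac
fi
```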
You can also use the -skipTrash option, which bypasses the trash. It is very handy for freeing disk space in an emergency.
$ hdfs dfs -rm -R -skipTrash /xxxx/22/feb/*
If you set fs.trash.interval=60 (minutes),
that means the files you deleted into the .Trash directory will be cleared after roughly 1 hour.
fs.trash.interval is the number of minutes after which a checkpoint gets deleted. It's advisable NOT to set it to 0, because that disables the trash feature.
Hope that helps
Created 04-03-2018 07:34 AM
-skipTrash is fine. However, why is .Trash not being cleaned up even though I set fs.trash.interval=360 (minutes)?
Kindly let me know whether there is any way to configure the trash in WASB (Azure Blob storage).