
.Trash is not being emptied on wasb (Azure Blob storage)


Issue with Trash.

I have configured fs.trash.interval to 360 minutes.

However, the trash is not being emptied according to that setting, i.e. after 6 hours.

I deleted one file on 22 February, another on 23 March, and another on 2 April.

With this configuration, the trash should contain only the file deleted on 2 April, but in my case the trash path still contains all the old files.

Kindly let me know how to configure trash correctly on wasb storage.

Thanks,

Praveen

1 ACCEPTED SOLUTION

Master Mentor

@Praveen Atmakuri

Here is some information I found on another forum:

1) "fs.trash.interval" is not respected on Blob storage.

2) "hadoop fs -expunge" creates a checkpoint instead of emptying the trash.

Apache documentation on 'fs -expunge' is a little confusing or inaccurate in simply stating it will "Empty the Trash".

The command will actually do two things:

a) delete all checkpoints that are older than the 'fs.trash.interval' config value.

b) create a new checkpoint of current Trash directory.

As for Azure Blob storage not respecting 'fs.trash.interval', there is a bug being tracked for this issue, and there are some technical difficulties in solving it. In HDFS, the NameNode enforces the interval and cleans up the trash according to the config. In Azure Blob storage, there is no component equivalent to the HDFS NameNode that can enforce the rule.
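Since nothing on the Blob storage side runs the cleanup, one workaround worth trying is to trigger it from a client. Below is a minimal sketch, untested on wasb, assuming `hadoop` is on the PATH, the default filesystem is the wasb container, and the trash lives under /user/<name>/.Trash. The function name `expunge_trash` and the `HADOOP_BIN` variable are made up for illustration and are not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: HADOOP_BIN and expunge_trash are illustrative names, not Hadoop features.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

expunge_trash() {
    # Assumed trash location of the current user on the default filesystem.
    trash_dir="/user/$(id -un)/.Trash"

    echo "Before expunge:"
    "$HADOOP_BIN" fs -ls "$trash_dir"

    # Deletes checkpoints older than fs.trash.interval, then checkpoints Current.
    "$HADOOP_BIN" fs -expunge || return 1

    echo "After expunge:"
    "$HADOOP_BIN" fs -ls "$trash_dir"
}
```

Calling `expunge_trash` from a cron job every few hours might approximate what the NameNode does on HDFS. Keep in mind that files still sitting in Current are only checkpointed by a run, so they would be removed by a later run once that checkpoint has aged past the interval.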

Could you try setting it to 5 minutes and test, just out of curiosity?

3 REPLIES

Master Mentor

@Praveen Atmakuri

If you changed the configuration through Ambari, did you restart the stale services?

You can use the -skipTrash option, which bypasses the trash. It is very handy for freeing disk space in an emergency.

$ hdfs dfs -rm -R -skipTrash /xxxx/22/feb/*

If you set fs.trash.interval=60 min

That means files moved to the .Trash directory will be cleared roughly 1 hour later (the exact moment depends on when the trash checkpoint is created).

fs.trash.interval is the number of minutes after which a checkpoint gets deleted. It's advisable NOT to set it to 0, because that disables the trash feature.
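To double-check what the client actually sees after the Ambari change, the value can be read back with `hdfs getconf`. Here is a rough sketch. `describe_trash_interval` is a hypothetical helper name, and it only interprets the number; it does not prove that the cleanup actually runs.

```shell
#!/bin/sh
# Hypothetical helper: turns an fs.trash.interval value (in minutes) into a readable summary.
describe_trash_interval() {
    minutes="$1"
    # Reject empty or non-numeric input.
    case "$minutes" in
        ''|*[!0-9]*) echo "invalid interval: $minutes"; return 1 ;;
    esac
    if [ "$minutes" -eq 0 ]; then
        # A value of 0 turns the trash feature off.
        echo "trash disabled"
    else
        echo "checkpoints kept for $minutes min ($((minutes / 60)) h $((minutes % 60)) min)"
    fi
}

# Usage on a cluster node (hdfs getconf prints the configured value):
#   describe_trash_interval "$(hdfs getconf -confKey fs.trash.interval)"
```

With a value of 360 this prints `checkpoints kept for 360 min (6 h 0 min)`.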

Hope that helps


-skipTrash is fine. However, why is .Trash not being emptied even though I set fs.trash.interval=360 min?

Kindly let me know whether there is any way to configure the trash on wasb (Azure Blob storage).
