Created 06-27-2019 12:53 AM
Hi,
I deleted many files from an HDFS subfolder, but no space was released according to hdfs dfs -du -h /hdfs_folder. The deleted files amounted to more than 70 GB, so with replication factor 3 about 210 GB should have been released, but the -du output did not change at all.
About 18 hours have now passed, and I also changed the following parameters in Cloudera Manager, but there is still no change:
changed fs.trash.interval from 1 day to 20 minutes (from 1440 to 20)
changed fs.trash.checkpoint.interval from 1 hour to 20 minutes (from 60 to 20)
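As a quick check (a sketch; the path assumes the files were removed by the hdfs user, so adjust it for the actual deleting user), you can see whether anything is actually sitting in that user's trash and how much space it holds:
# list trash checkpoints for the given user
hdfs dfs -ls /user/hdfs/.Trash
# show how much space the trash is holding
hdfs dfs -du -s -h /user/hdfs/.Trash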
The other question: the total I get by adding up each file's size is about 2 TB less than what -du reports for the bigger folder, which shows 8.4 TB. How can I find out what that extra space is being used for? Thanks.
du output:
$ hdfs dfs -du -h /folder1
...
97.3 G 292.0 G /folder1/folder2
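For reference, the first -du column is the raw data size and the second is the space consumed on disk including replication. To cross-check the totals against the file listing, a summary and a count view can help (a sketch; /folder1 stands in for the real path):
# one-line summary: raw size vs. space consumed with replication
hdfs dfs -du -s -h /folder1
# directory/file counts and content size, useful for spotting discrepancies
hdfs dfs -count -q -h /folder1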
Below are the commands I used:
1. List the files/folders; then, based on the output, I calculated the total size of the files/folders to be deleted
hdfs dfs -ls -R /$hdfs_folder
Output of the above command. I used the size column (4008 in this sample line) to add up the sizes of the files; a summing one-liner is shown after these steps.
-rw-r--r-- 3 ayasdi supergroup 4008 2018-03-07 18:48 /folder1/folder2/folder3
2. Delete the folders
sudo -u hdfs hdfs dfs -rm -r -skipTrash $folder
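Here is the summing one-liner mentioned in step 1 (a sketch: the size is field 5 of the -ls -R output, directory lines are skipped, and the result is the raw size, i.e. comparable to the first -du column, not the replicated one):
# sum the size column (field 5) of every file under the folder and print it in GB
hdfs dfs -ls -R /$hdfs_folder | grep -v '^d' | awk '{ total += $5 } END { printf "%.1f GB\n", total / (1024*1024*1024) }'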
I also ran expunge several times, but there was no change in the hdfs dfs -du output.
sudo -u hdfs hadoop fs -expunge
Created 10-30-2019 01:58 PM
To answer my original post: the reason was that old snapshots, created before the deletion, still reserved the space. Once those snapshots were deleted, the space was released.
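For anyone hitting the same problem, roughly the steps involved (a sketch; the folder path and snapshot name below are examples, not the real ones):
# find which directories are snapshottable
hdfs lsSnapshottableDir
# list the snapshots kept for a given folder
hdfs dfs -ls /folder1/folder2/.snapshot
# delete an old snapshot so its blocks can be reclaimed
sudo -u hdfs hdfs dfs -deleteSnapshot /folder1/folder2 snapshot_name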
Created 11-11-2019 07:49 PM
@tina_zy_qian Thanks for letting us know you solved your issue. If you could mark the reply as the solution (by clicking the Accept as Solution button), it would help others in a similar situation find it in the future.
Created 06-02-2023 08:51 AM
Could you please share the command you used to delete the old snapshots?