I deleted many files from an HDFS subfolder, but no space was released according to hdfs dfs -du -h /hdfs_folder. The deleted files totaled over 70 GB, so with a replication factor of 3 about 210 GB should have been released, yet the -du output did not change at all.
About 18 hours have passed since then, and I changed the following parameters on Cloudera, but there was still no change:
changed fs.trash.interval from 1 day to 20 minutes (from 1440 to 20)
changed fs.trash.checkpoint.interval from 1 hour to 20 minutes (from 60 to 20)
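For reference, these are the standard Hadoop trash settings; the equivalent core-site.xml entries (both values in minutes) would look like this. This is just the XML form of the two changes above, not anything extra I changed:

```xml
<!-- Minutes a deleted file stays in the trash before being purged -->
<property>
  <name>fs.trash.interval</name>
  <value>20</value>
</property>

<!-- Minutes between trash checkpoint runs -->
<property>
  <name>fs.trash.checkpoint.interval</name>
  <value>20</value>
</property>
```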
My other question: the total I get by adding up the individual file sizes is about 2 TB less than what -du reports on a bigger folder (8.4 TB). How can I find out what that extra space is being used for? Thanks.
$ hdfs dfs -du -h /folder1
97.3 G 292.0 G /folder1/folder2
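My understanding (an assumption worth double-checking against the HDFS docs) is that the first -du column is the logical size and the second is the space consumed across all replicas, so with replication 3 the second column should be roughly three times the first. A quick sanity check on the numbers above:

```shell
# 97.3 G logical size x 3 replicas should be close to the 292.0 G
# shown in the second column of the -du output above.
awk 'BEGIN { printf "%.1f\n", 97.3 * 3 }'   # prints 291.9
```

So when comparing against a sum of individual file sizes, the first column (not the replicated one) is the number to match.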
Below are the commands I used:
1. List the files/folders; from this output I calculated the total size of the files/folders to be deleted:
hdfs dfs -ls -R /$hdfs_folder
Sample output of the above command; I added up the values in the size column (4008 in this example) across all files:
-rw-r--r-- 3 ayasdi supergroup 4008 2018-03-07 18:48 /folder1/folder2/folder3
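One way to do that summation (a sketch; I may have tallied it differently by hand) is an awk over the 5th column of the listing. It is demonstrated below on a captured sample line, since the real hdfs dfs -ls -R pipe needs a live cluster:

```shell
# In practice: hdfs dfs -ls -R /folder1 | awk '{ sum += $5 } END { print sum }'
# Demonstrated here on a captured sample line of -ls -R output:
printf '%s\n' \
  '-rw-r--r--   3 ayasdi supergroup       4008 2018-03-07 18:48 /folder1/folder2/folder3' \
  | awk '{ sum += $5 } END { print sum }'   # prints 4008
```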
2. Delete the folders:
sudo -u hdfs hdfs dfs -rm -r -skipTrash $folder
I also ran expunge several times, but there was no change in the hdfs dfs -du output:
sudo -u hdfs hadoop fs -expunge