My HDFS has a total disk space of 28.2 TB, of which 15.1 TB is useful data. After a while, Ambari reported that the disk space was 75% full, so I started "Balance HDFS" from Ambari. Since then, the available disk space has decreased slowly until it was all gone. Now I have no usable disk space left. How can I reclaim the unused disk space?
hdfs@msl-dpe-perf88:/$ hdfs dfs -du -h -s /
15.1 T  /
hdfs@msl-dpe-perf88:/$ hdfs dfs -df -h
Filesystem                          Size    Used    Available  Use%
hdfs://msl-dpe-perf88.msl.lab:8020  28.2 T  27.1 T  0          96%
When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first moves it to a trash directory (in current Hadoop releases, /user/<username>/.Trash). The file can be restored quickly as long as it remains in the trash. The retention time in the trash is configurable. Once that time expires, the NameNode deletes the file from the HDFS namespace, which frees the blocks associated with the file. Note that there can be an appreciable delay between the time a user deletes a file and the time the corresponding free space appears in HDFS.
To change the default setting, update it in the core-site properties through Ambari: from the Ambari Dashboard, click HDFS -> Configs -> Advanced -> Advanced core-site, then set 'fs.trash.interval' to 0 to disable trash. The related components must be restarted to pick up the change.
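Before changing the setting, it can help to see what value is currently in effect. The following is a minimal sketch: `trash_status` is a hypothetical helper name, and `hdfs getconf -confKey` reports the value from the configuration visible on the machine where it runs, which may differ from the NameNode's if the configs are out of sync. The value is interpreted as minutes.

```shell
# Describe what a given fs.trash.interval value (in minutes) means.
# trash_status is a hypothetical helper, not part of Hadoop.
trash_status() {
  interval="$1"
  if [ "$interval" -eq 0 ] 2>/dev/null; then
    echo "trash disabled"
  elif [ "$interval" -gt 0 ] 2>/dev/null; then
    echo "trash enabled, checkpoints kept for $interval minutes"
  else
    echo "unexpected value: $interval"
  fi
}

# Query the live configuration only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  trash_status "$(hdfs getconf -confKey fs.trash.interval)"
fi
```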
Check the HDFS structure to see where most of the data is held. The command below gives a breakdown of HDFS usage across the cluster and on each data node:
$ hdfs dfsadmin -report
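The report is long on a multi-node cluster, so a small filter can make the per-node picture easier to read. This is a sketch only: `per_node_usage` is a hypothetical helper, and it assumes each DataNode section of the report starts with a `Name:` line and contains a `DFS Used%:` line, which may vary between Hadoop versions.

```shell
# Print "<datanode> <DFS Used%>" for every DataNode section read from stdin.
# per_node_usage is a hypothetical helper; the field layout is an assumption.
per_node_usage() {
  awk '
    /^Name:/      { node = $2 }                       # start of a DataNode section
    /^DFS Used%:/ { if (node != "") print node, $3 }  # skip the cluster-wide summary line
  '
}

# Run against the live cluster only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfsadmin -report | per_node_usage
fi
```

A node that is far fuller than the others would point to an imbalance rather than a cluster-wide shortage.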
To empty the trash, run the command below; give it some time to complete.
$ hdfs dfs -expunge
By default, HDFS uses trash. You can empty the trash with the -expunge command above, or bypass the trash altogether when cleaning up data by using the -skipTrash flag:
$ hdfs dfs -rm -R -skipTrash /folder-path
I had been using the -skipTrash option when deleting files, and the /user/hdfs/.Trash directory is empty. I also ran the -expunge command 24 hours ago. I still do not see disk space being freed. Here are the results from the dfsadmin command:
hdfs@msl-dpe-perf88:/$ hdfs dfs -ls /user/hdfs/.Trash
hdfs@msl-dpe-perf88:/$
hdfs@msl-dpe-perf88:/$ hdfs dfsadmin -report
Configured Capacity: 31048107810816 (28.24 TB)
Present Capacity: 29767722012672 (27.07 TB)
DFS Remaining: 0 (0 B)
DFS Used: 29767722012672 (27.07 TB)
DFS Used%: 100.00%
Under replicated blocks: 97449
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
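One check the figures above allow is a comparison of raw space used (27.07 TB) with the logical data size (15.1 T from -du). The sketch below does that arithmetic with two hypothetical helpers. The replication factor is not stated in this thread; 3 is Hadoop's default and is only an assumption here, and the real value can be read with `hdfs getconf -confKey dfs.replication`.

```shell
# Ratio of raw space consumed to logical data size (both in the same unit).
effective_replication() {
  awk -v used="$1" -v logical="$2" 'BEGIN { printf "%.2f\n", used / logical }'
}

# Raw space needed to hold the logical data at a given replication factor.
raw_needed() {
  awk -v logical="$1" -v rf="$2" 'BEGIN { printf "%.1f\n", logical * rf }'
}

effective_replication 27.07 15.1   # figures from the outputs above; prints 1.79
raw_needed 15.1 3                  # 3 is an ASSUMED replication factor; prints 45.3
```

If the replication factor really is 3, the data would need about 45.3 TB of raw space, more than the 28.24 TB configured. That would be consistent with the large number of under-replicated blocks in the report, but the thread does not confirm it.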
How many data nodes do you have in your cluster?
Can you try to isolate the culprit?
$ hdfs dfs -du -h /
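To rank directories by size, it is easier to drop -h so the first column is plain bytes and can be sorted numerically. A minimal sketch, with `largest_entries` as a hypothetical helper name. On newer Hadoop releases the output has a second numeric column (space consumed including replicas); the sort below uses only the first column.

```shell
# Sort "hdfs dfs -du" output (size in bytes first) and keep the N largest entries.
# largest_entries is a hypothetical helper; N defaults to 10.
largest_entries() {
  sort -k1,1 -n -r | head -n "${1:-10}"
}

# Run against the live cluster only when the hdfs CLI is available.
if command -v hdfs >/dev/null 2>&1; then
  hdfs dfs -du / | largest_entries 5
fi
```

Repeating the command on the largest directory it reports narrows the search one level at a time.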
If you enabled snapshots, that could be one reason. Can you check whether any exist?
$ hdfs lsSnapshottableDir
I have 4 data nodes and no snapshots set. Here is the output from the commands:
hdfs@msl-dpe-perf88:/$ hdfs dfs -df -h
Filesystem                          Size    Used    Available  Use%
hdfs://msl-dpe-perf88.msl.lab:8020  28.2 T  27.1 T  0          96%
hdfs@msl-dpe-perf88:/$ hdfs lsSnapshottableDir
hdfs@msl-dpe-perf88:/$