HDFS Space not reclaimed

Expert Contributor

HDP-2.4.2.0-258 installed using Ambari 2.2.2.0

I ran TestDFSIO on the cluster, but it failed midway. By then it had written a large amount of data to HDFS, and Ambari now shows HDFS utilization at 98%.

I deleted the benchmark directory created by TestDFSIO and then ran expunge:

[hdfs@l4377t root]$ hdfs dfs -ls /benchmarks
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2016-08-08 12:48 /benchmarks/TestDFSIO
[hdfs@l4377t root]$
[hdfs@l4377t root]$
[hdfs@l4377t root]$ hdfs dfs -ls -h /benchmarks
Found 1 items
drwxr-xr-x   - hdfs hdfs          0 2016-08-08 12:48 /benchmarks/TestDFSIO
[hdfs@l4377t root]$
[hdfs@l4377t root]$ hdfs dfs -ls -h /benchmarks/TestDFSIO
Found 3 items
drwxr-xr-x   - hdfs hdfs          0 2016-08-08 12:48 /benchmarks/TestDFSIO/io_control
drwxr-xr-x   - hdfs hdfs          0 2016-08-08 12:58 /benchmarks/TestDFSIO/io_data
drwx--x--x   - hdfs hdfs          0 2016-08-08 13:02 /benchmarks/TestDFSIO/io_write
[hdfs@l4377t root]$
[hdfs@l4377t root]$ hdfs dfs -rmr /benchmarks/
rmr: DEPRECATED: Please use 'rm -r' instead.
16/08/09 09:15:33 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://l4283t.sss.com:8020/benchmarks' to trash at: hdfs://l4283t.sss.com:8020/user/hdfs/.Trash/Current
[hdfs@l4377t root]$
[hdfs@l4377t root]$
[hdfs@l4377t root]$ hdfs dfs -expunge
16/08/09 09:16:13 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
16/08/09 09:16:13 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/hdfs/.Trash/160809091613

However, neither the disk space nor the HDFS space has been freed. Below is the df -h output for the DataNode directories:

/dev/vdc                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vdc
/dev/vdk                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vdk
/dev/vdl                     10G  8.9G  1.2G  89% /opt/hdfsdisks/vdl
/dev/vdh                     10G  8.9G  1.2G  89% /opt/hdfsdisks/vdh
/dev/vdg                     10G  8.9G  1.2G  89% /opt/hdfsdisks/vdg
/dev/vdj                     10G  8.9G  1.2G  89% /opt/hdfsdisks/vdj
/dev/vdi                     10G  8.9G  1.2G  89% /opt/hdfsdisks/vdi
/dev/vde                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vde
/dev/vdd                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vdd
/dev/vdb                     10G  8.9G  1.1G  90% /opt/hdfsdisks/vdb
/dev/vdm                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vdm
/dev/vdf                     10G  9.0G  1.1G  90% /opt/hdfsdisks/vdf

EDIT (@Benjamin Leonhardi, can you check this?): Output for those directories AFTER expunge.

I tried expunge earlier as well as just now, but the TestDFSIO files keep showing up in the trash!

-rw-------   3 hdfs      hdfs            0 2016-08-08 13:57 /user/hdfs/.Trash/160809091613/benchmarks/TestDFSIO/io_data/test_io_28
-rw-------   3 hdfs      hdfs            0 2016-08-08 13:57 /user/hdfs/.Trash/160809091613/benchmarks/TestDFSIO/io_data/test_io_26
-rw-------   3 hdfs      hdfs            0 2016-08-08 13:57 /user/hdfs/.Trash/160809091613/benchmarks/TestDFSIO/io_data/test_io_25
-rw-------   3 hdfs      hdfs            0 2016-08-08 13:57 /user/hdfs/.Trash/160809091613/benchmarks/TestDFSIO/io_data/test_io_24
-rw-------   3 hdfs      hdfs            0 2016-08-08 13:57 /user/hdfs/.Trash/160809091613/benchmarks/Te
1 ACCEPTED SOLUTION

Expert Contributor

Yeah, I expunged, but does that mean the space reclaim only starts 360 minutes (6 h) after the file is deleted?

/*EDIT added after the space was auto-reclaimed*/

It seems strange, but the space had been reclaimed when I checked today, so the reclaim probably did start after 6 h 😞
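
The 6 h delay can be checked against the checkpoint name in the expunge log (Created trash checkpoint: /user/hdfs/.Trash/160809091613). The sketch below rests on two assumptions that the thread does not confirm: that the checkpoint name is its creation time in yyMMddHHmmss form, and that a checkpoint only becomes eligible for deletion once it is older than the deletion interval. The helper name checkpoint_expiry is hypothetical, and GNU date is assumed.

```shell
# Earliest time a trash checkpoint becomes eligible for deletion.
#   $1: checkpoint directory name, assumed to be yyMMddHHmmss (e.g. 160809091613)
#   $2: deletion interval in minutes, as reported in the log (e.g. 360)
# TZ=UTC is used only to keep the arithmetic free of DST shifts;
# the result is in the same (local) clock as the checkpoint name.
checkpoint_expiry() {
  # Turn 160809091613 into "2016-08-09 09:16:13"
  created=$(printf '%s\n' "$1" |
    sed -E 's/^(..)(..)(..)(..)(..)(..)$/20\1-\2-\3 \4:\5:\6/')
  # Convert to epoch seconds (GNU date), then add the interval
  start=$(TZ=UTC date -d "$created" +%s) || return 1
  TZ=UTC date -d "@$((start + $2 * 60))" '+%Y-%m-%d %H:%M:%S'
}
```

Calling checkpoint_expiry 160809091613 360 prints 2016-08-09 15:16:13. The actual deletion would happen at the next emptier run or expunge after that time, so the space may be freed somewhat later.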

No, the expunge should take effect immediately. HDFS may take a little while before the DataNodes actually get around to deleting the files, but it shouldn't take long. So expunge doesn't help? Weird 🙂

You see that line:
16/08/09 09:16:13 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.

By default, HDFS uses a trash. You can bypass it with rm -skipTrash, or just empty the trash with

hadoop fs -expunge

If you want to delete files and have them not go in the trash, use:

hdfs dfs -rm -r -skipTrash <directory name>
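
To see which retention applies and how much data is still sitting in the trash, something like the following may help. This is a minimal sketch: trash_status is a hypothetical helper name, and it assumes the hdfs client is on the PATH and that the trash lives under /user/<user>/.Trash, as in the listing earlier in this thread. Note that hdfs getconf reads the client-side configuration, which may differ from the NameNode value reported in the log line.

```shell
# Show the configured trash retention and the current size of a user's trash.
#   $1: HDFS user whose trash to inspect (defaults to hdfs)
trash_status() {
  user="${1:-hdfs}"
  # Retention in minutes; 0 means the trash is disabled on this side
  hdfs getconf -confKey fs.trash.interval
  # Total size of everything under the trash, including checkpoints
  hdfs dfs -du -s -h "/user/${user}/.Trash"
}
```

Run it as trash_status hdfs before and after an expunge to check whether the trash size actually changes.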

Expert Contributor

Yeah, but what should I do about the files that are already in the trash, given that expunge doesn't help?

You can always remove the files in .Trash as you would any other directory/file.

hdfs dfs -rm -r -skipTrash /user/hdfs/.Trash/*
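
After removing the trash contents, it can be worth confirming that the space really comes back, since the DataNodes delete blocks asynchronously. A rough sketch, assuming the hdfs client is available and run as an HDFS superuser; dfs_used_summary is a hypothetical helper name:

```shell
# Print the "DFS Used%" lines (cluster-wide and per DataNode)
# from the HDFS admin report.
dfs_used_summary() {
  hdfs dfsadmin -report | grep 'DFS Used%'
}
```

Running dfs_used_summary a few times over several minutes should show the percentages dropping if the blocks are being removed; df -h on the DataNode directories should follow.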