
HDFS storage alert

Master Collaborator

I am getting an alert on HDFS (see hdfs-storage-alert.jpg). It says:

Remaining Capacity:[8112899521], Total Capacity:[82% Used, 46121660928]

The used capacity is reported as 82%, but the "df -h" command shows 68% used and "hdfs dfs -df -h" shows 5% used.

Why all these discrepancies?

[hdfs@hadoop1 ~]$ hdfs dfs -df -h
Filesystem                                    Size    Used  Available  Use%
hdfs://  214.8 G  11.0 G     93.0 G    5%
[hdfs@hadoop1 ~]$
[hdfs@hadoop1 ~]$
[root@hadoop1 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
                       50G   32G   16G  68% /
tmpfs                 5.9G     0  5.9G   0% /dev/shm
/dev/sda1             477M   72M  381M  16% /boot
                      144G  361M  136G   1% /home


Re: HDFS storage alert

Super Collaborator

hdfs dfs -df reports on the storage of the entire cluster, not a single DataNode. df -h, on the other hand, reports only on the local file systems of one node, not the cluster. So the two will not match. I can't explain the discrepancy between the 68% and the 82%.
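For what it's worth, the 82% and the 5% can both be reproduced from the numbers quoted in the question. This is only a rough sketch of the arithmetic; the variable names are mine, and I am assuming the alert computes used space as total minus remaining.

```shell
# Byte counts copied from the alert text in the question.
remaining=8112899521
total=46121660928

# Assumed formula: used% = (total - remaining) / total, integer division.
alert_used_pct=$(( (total - remaining) * 100 / total ))
echo "alert: ${alert_used_pct}% used"

# Cluster-wide figures from `hdfs dfs -df -h`: 11.0 G used of 214.8 G.
hdfs_used_pct=$(awk 'BEGIN { printf "%.0f", 11.0 * 100 / 214.8 }')
echo "hdfs dfs -df: ${hdfs_used_pct}% used"
```

Note that the alert's total of 46121660928 bytes is only about 43 GiB, far less than the 214.8 G that hdfs dfs -df shows for the cluster, so the alert may be describing a single DataNode. Running "hdfs dfsadmin -report" lists Configured Capacity, DFS Used and DFS Remaining per DataNode, which should help confirm which node it is.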

You may need to rebalance:
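A minimal sketch of starting the balancer, assuming you run it as the HDFS superuser (the hdfs account in this thread). The threshold value is just the balancer's default, not something tuned for this cluster, and the guard around the command is only there so the snippet does nothing harmful on a machine without the Hadoop client.

```shell
# Threshold is the allowed deviation, in percent, of each DataNode's
# utilization from the cluster average; 10 is the balancer's default.
threshold=10
balancer_cmd="hdfs balancer -threshold $threshold"

if command -v hdfs >/dev/null 2>&1; then
  # Moves blocks between DataNodes until each is within the threshold.
  $balancer_cmd
else
  echo "hdfs not on PATH; would run: $balancer_cmd"
fi
```

Keep in mind that the balancer only evens out usage between DataNodes. It does not free any space, so it will not help if every node is already close to full.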

Re: HDFS storage alert

Master Collaborator

I ran the rebalance, but the alert is still there, saying total capacity is 82% used.

Re: HDFS storage alert

@Sami Ahmad

Can you please check whether some data is still present in "/user/hdfs/.Trash"?
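Something along these lines should show how much space the trash is holding. It is only a sketch: the path assumes the files were deleted by the hdfs user, and other users have their own trash under /user/<username>/.Trash.

```shell
# Trash directory of the hdfs user; adjust for other users.
trash_dir="/user/hdfs/.Trash"

if command -v hdfs >/dev/null 2>&1; then
  # Summarized, human-readable size of everything still held in trash.
  hdfs dfs -du -s -h "$trash_dir"
  # Optional: checkpoint the current trash and permanently delete
  # checkpoints older than the retention threshold.
  # hdfs dfs -expunge
else
  echo "hdfs not on PATH; would check: $trash_dir"
fi
```

For future deletes, "hdfs dfs -rm -r -skipTrash <path>" bypasses the trash entirely, so use it with care.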

You can find more details on "hdfs dfs -expunge" and the "-skipTrash" option in the HDFS documentation. As per the documentation:

When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS moves it to a trash directory (each user has its own trash directory under /user/<username>/.Trash). The file can be restored quickly as long as it remains in trash. Most recent deleted files are moved to the current trash directory (/user/<username>/.Trash/Current), and in a configurable interval, HDFS creates checkpoints (under /user/<username>/.Trash/<date>) for files in current trash directory and deletes old checkpoints when they are expired. After the expiry of its life in trash, the NameNode deletes the file from the HDFS namespace. The deletion of a file causes the blocks associated with the file to be freed. Note that there could be an appreciable time delay between the time a file is deleted by a user and the time of the corresponding increase in free space in HDFS.

Currently, the trash feature is disabled by default (deleting files without storing in trash). User can enable this feature by setting a value greater than zero for parameter fs.trash.interval (in core-site.xml). This value tells the NameNode how long a checkpoint will be expired and removed from HDFS. In addition, user can configure an appropriate time to tell NameNode how often to create checkpoints in trash (the parameter stored as fs.trash.checkpoint.interval in core-site.xml), this value should be smaller or equal to fs.trash.interval.
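To see what the cluster is actually using for these two parameters, you could query the effective configuration from a node with the Hadoop client installed. This is a sketch and assumes the client reads the same core-site.xml as the NameNode; both values are in minutes, and 0 for fs.trash.interval means trash is disabled.

```shell
# The two trash-related keys mentioned above.
trash_keys="fs.trash.interval fs.trash.checkpoint.interval"

for key in $trash_keys; do
  if command -v hdfs >/dev/null 2>&1; then
    # Prints the effective value, including defaults.
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  else
    echo "hdfs not on PATH; would query: $key"
  fi
done
```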