In HDFS 3.3.5 storage utilisation is marked closer to 100%

Explorer

Hi all,
I'm facing an issue with an HDFS 3.3.5 cluster. I have created a total of 650 files with replication set to 1 and block size set to 4MB, where each file is 4MB in size, which should come to roughly 2.5GB in total, but storage utilisation is reported as close to 100%.


[Screenshot attached: Screenshot from 2025-03-05 11-32-40.png]

Thanks.

2 REPLIES

Contributor

Hello

I had a similar problem in the past: I had snapshot policies that took hourly snapshots and retained the last hours + 3 days + 4 weeks. The disk space was only released after I deleted the snapshots, because deleting files does not free blocks that are still referenced by HDFS snapshots.

I also emptied my HDFS Trash and checked that all users' HDFS Trash directories were empty or small.
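
For reference, the commands involved look roughly like this (a sketch; the /data path and the snapshot name are just examples, adjust to your own directories):

# List directories that allow snapshots, then see which snapshots exist
hdfs lsSnapshottableDir
hdfs dfs -ls /data/.snapshot

# Delete a snapshot that is still pinning old blocks (example snapshot name)
hdfs dfs -deleteSnapshot /data s20250305-hourly

# Empty your own HDFS Trash and check the size of every user's Trash
hdfs dfs -expunge
hdfs dfs -du -h /user/*/.Trash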

Maybe that could help.

Regards

Expert Contributor
It sounds like you are encountering a common issue in HDFS where metadata overhead, per-file bookkeeping, and non-DFS usage cause a large discrepancy between your actual data size and your disk utilization.
 
While 650 files at 4MB each technically equals 2.6GB of data, the way HDFS manages these on your physical disks (especially in smaller or test clusters) can lead to unexpected storage consumption.
 
➤ Root Causes of the 100% Utilization
 
1. Reserved Space and "Non-DFS Used"
HDFS does not get the entire disk to itself. Hadoop can reserve a portion of each volume for the OS and non-Hadoop data (controlled by dfs.datanode.du.reserved), and everything else stored on the disk counts against it as well. If you are running on small disks (e.g., 20GB–50GB), the combination of your data, logs, and reserved space can quickly hit the 100% threshold.
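
A quick way to see this from the OS side (a sketch; replace /hadoop/hdfs/data with whatever dfs.datanode.data.dir points to on your DataNode):

# What the OS sees for the disk hosting the DataNode volume
df -h /hadoop/hdfs/data

# How much of that is actual HDFS block data vs. everything else on the disk
du -sh /hadoop/hdfs/data
du -sh /var/log/hadoop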
 
2. Local Filesystem Block Overheads
Even though your HDFS block size is 4MB, the underlying OS filesystem (EXT4 or XFS) uses its own block size (usually 4KB), and every HDFS block is stored alongside a checksum (.meta) file. The metadata for 650 individual files, their checksum files, and the edit logs on the NameNode create a "death by a thousand cuts" scenario on small disks.
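
You can see this per-block overhead directly on a DataNode (again a sketch; the path below is an example of where dfs.datanode.data.dir usually lives):

# Each HDFS block is stored as a blk_<id> file plus a blk_<id>_<genstamp>.meta checksum file
find /hadoop/hdfs/data/current -name 'blk_*' | head -20
find /hadoop/hdfs/data/current -name 'blk_*' | wc -l   # roughly 2 entries per block (data + .meta)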
 
3. Log Accumulation
Check /var/log/hadoop or your configured log directory. In HDFS 3.3.5, if a cluster is struggling with space, the DataNodes and NameNodes generate massive amounts of "Heartbeat" and "Disk Full" logs, which consume the remaining Non-DFS space, pushing the disk to 100%.
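
Before deleting anything, it is worth measuring the log directory first (using the /var/log/hadoop path mentioned above):

# Largest log consumers, sorted smallest to largest
du -sh /var/log/hadoop/* | sort -h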
 
➤ How to Tackle the Situation
Step 1: Identify Where the Space Is Going
Run the following command to see if the space is taken by HDFS data or other files:
$ hdfs dfsadmin -report
 
DFS Used: Space taken by your 650 files.
Non-DFS Used: Space taken by logs, OS, and other applications. If this is high, your logs are the culprit.
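
To cross-check the report against your actual data, a couple of follow-up commands (a sketch):

# Total size of everything stored in HDFS (should be roughly 2.6GB here)
hdfs dfs -du -s -h /

# Same report, broken down per live DataNode
hdfs dfsadmin -report -live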
 
 
Step 2: Clear Logs and Temporary Data
If "Non-DFS Used" is high, clear out the Hadoop log directory:
 
# Example path; adjust to your configured log directory
# Delete only rotated log/output files (e.g. *.log.1, *.out.1), keeping the active ones
rm -rf /var/log/hadoop/hdfs/*.log.*
rm -rf /var/log/hadoop/hdfs/*.out.*
 
Step 3: Adjust the Reserved Space Threshold
A DataNode marks a volume as full once its free space (after subtracting the reserved amount) runs out, and stops accepting new blocks on it. If you are in a test environment and need to squeeze out more space, you can tune how much space HDFS leaves untouched per volume with dfs.datanode.du.reserved in hdfs-site.xml:
 
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Reserved non-HDFS space per volume, in bytes (1073741824 = 1 GB) -->
  <value>1073741824</value>
</property>
 
Step 4: Combine Small Files (Long-term Fix)
HDFS is designed for large files; 650 files of 4MB each are considered "small files".
 
The Problem: Every file, regardless of size, takes up roughly 150 bytes of RAM on the NameNode and creates separate metadata entries.
 
The Solution: Use the hdfs dfs -getmerge command (which concatenates the files onto the local filesystem so you can re-upload the result as one file) or a MapReduce/Spark job to combine these 650 files into 2 or 3 larger files (e.g., 1GB each), as sketched below.
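
A minimal sketch of the getmerge approach, assuming the small files sit under a hypothetical /data/small directory:

# Pull all 650 small files down and concatenate them into one local file
hdfs dfs -getmerge /data/small /tmp/merged.dat

# Upload the merged result back as a single large file
hdfs dfs -put /tmp/merged.dat /data/merged.dat

# Once the merged copy is verified, remove the small files
# (-skipTrash frees the blocks immediately instead of moving them to .Trash)
hdfs dfs -rm -r -skipTrash /data/small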