Member since: 05-12-2024
Posts: 6
Kudos Received: 3
Solutions: 0
01-09-2026
11:52 PM
➤ It sounds like you are encountering a common HDFS issue where metadata overhead and block minimums cause a large discrepancy between your actual data size and your disk utilization. While 650 files at 4MB each is only about 2.6GB of data, the way HDFS manages these files on your physical disks (especially in smaller or test clusters) can lead to unexpected storage consumption.

➤ Root Causes of the 100% Utilization

1. Reserved Space and "Non-DFS Used"
HDFS does not have access to the entire disk. By default, Hadoop reserves a portion of the disk for the OS and non-Hadoop data (defined by dfs.datanode.du.reserved). If you are running on small disks (e.g., 20GB–50GB), the combination of your data, logs, and reserved space can quickly hit the 100% threshold.

2. Local Filesystem Block Overheads
Even though your HDFS block size is 4MB, the underlying OS filesystem (EXT4 or XFS) uses its own block size (usually 4KB). The metadata for 650 individual files, their checksum (.meta) files, and the edit logs on the NameNode create a "death by a thousand cuts" scenario on small disks.

3. Log Accumulation
Check /var/log/hadoop or your configured log directory. In HDFS 3.3.5, if a cluster is struggling with space, the DataNodes and NameNode generate large volumes of heartbeat and "Disk Full" logs, which consume the remaining non-DFS space and push the disk to 100%.

➤ How to Tackle the Situation

Step 1: Identify Where the Space Is Going
Run the following command to see whether the space is taken by HDFS data or by other files:

$ hdfs dfsadmin -report

DFS Used: space taken by your 650 files.
Non-DFS Used: space taken by logs, the OS, and other applications. If this is high, your logs are the likely culprit.

Step 2: Clear Logs and Temporary Data
If "Non-DFS Used" is high, clear out the Hadoop log directory:

# Example path
rm -rf /var/log/hadoop/hdfs/*.log.*
rm -rf /var/log/hadoop/hdfs/*.out.*

Step 3: Adjust the Reserved-Space Threshold
By default, a DataNode stops accepting new blocks when the disk is nearly full (around 95%). If you are in a test environment and need to squeeze out more space, you can lower the reserved space in hdfs-site.xml (the value below is 1GB, in bytes):

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>1073741824</value>
</property>

Step 4: Combine Small Files (Long-Term Fix)
HDFS is designed for large files; 650 files of 4MB each are classic "small files."
The Problem: every file, regardless of size, takes roughly 150 bytes of NameNode RAM and creates separate metadata entries.
The Solution: use the getmerge command or a MapReduce/Spark job to combine these 650 files into 2 or 3 larger files (e.g., 1GB each), as sketched below.
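To illustrate Step 4, here is a minimal Java sketch of the merge idea using the Hadoop FileSystem API. The paths and class name are placeholders, it assumes the files are in a format that can simply be concatenated (e.g., plain text/CSV), and in practice hdfs dfs -getmerge or a Spark job may be simpler:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class MergeSmallFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Placeholder paths: the directory holding the ~650 small files
        // and the single large output file that replaces them.
        Path inputDir = new Path("/data/small-files");
        Path outputFile = new Path("/data/merged/part-0000");

        try (FSDataOutputStream out = fs.create(outputFile)) {
            for (FileStatus status : fs.listStatus(inputDir)) {
                if (!status.isFile()) {
                    continue; // skip subdirectories
                }
                try (FSDataInputStream in = fs.open(status.getPath())) {
                    // Append the contents of each small file to the merged file.
                    IOUtils.copyBytes(in, out, conf, false);
                }
            }
        }
        // After verifying the merged file, the small files can be removed:
        // fs.delete(inputDir, true);
        fs.close();
    }
}

Only after verifying the merged output should the original small files be deleted; that is what actually shrinks the NameNode metadata.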
07-17-2024
09:08 AM
1 Kudo
When the local NameNode is healthy, the ZKFC holds a session open in ZooKeeper. If the local NameNode is active, it also holds a special "lock" znode. This lock uses ZooKeeper's support for ephemeral nodes: if the session expires, the lock znode is automatically deleted.

Ephemeral znodes: these znodes exist only as long as the session that created them is active. When the session ends, the znode is deleted.

ZooKeeper also has a concept of watches, which can be set on znodes to track them for changes. See how watches work in ZooKeeper here: https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#ch_zkWatches
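Here is a minimal Java sketch (not the actual ZKFC code) showing both ideas: an ephemeral znode that disappears when its session ends, and a watch that lets another client learn about it. The connect string and znode path are placeholders:

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class EphemeralLockDemo {
    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper; the lambda is the default watcher for this handle.
        // (A real client would wait for the SyncConnected event before proceeding.)
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event ->
                System.out.println("Event: " + event.getType() + " on " + event.getPath()));

        // Ephemeral znode: it lives only as long as this session is alive.
        zk.create("/demo-lock", "owner-1".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // A watch on the znode fires once when it is changed or deleted.
        Stat stat = zk.exists("/demo-lock", true);
        System.out.println("Lock znode exists, version: " + stat.getVersion());

        // Closing the session (or a session expiry) automatically deletes the
        // ephemeral znode; a watcher set by another session would then receive
        // a NodeDeleted event, which is how other ZKFCs notice the lost lock.
        zk.close();
    }
}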
05-24-2024
02:34 AM
Hi @NaveenBlaze ,

You can get more info from https://github.com/c9n/hadoop/blob/master/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java#L196 . Notice these lines in the method doTailEdits:

FSImage image = namesystem.getFSImage();
streams = editLog.selectInputStreams(lastTxnId + 1, 0, null, false);
editsLoaded = image.loadEdits(streams, namesystem);
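For context, here is the same snippet with my own annotations of what each call does in that version of the code; the comments are my reading, not part of the Hadoop source:

// The standby NameNode's in-memory namespace image.
FSImage image = namesystem.getFSImage();

// Select edit log input streams starting at the first transaction the standby
// has not yet applied (lastTxnId + 1). Here toAtLeastTxnId = 0 means no minimum
// number of transactions is required, the recovery context is null, and
// inProgressOk = false means only finalized log segments are tailed.
streams = editLog.selectInputStreams(lastTxnId + 1, 0, null, false);

// Replay the selected edits into the standby's namespace; the return value is
// the number of edits that were applied.
editsLoaded = image.loadEdits(streams, namesystem);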