Member since 02-17-2016
20 Posts · 2 Kudos Received · 0 Solutions
07-22-2020
05:45 AM
Unfortunately, the given solution is not accurate. In HDFS, a block that is open for write does reserve the full 128MB, but as soon as the file is closed, the last block of the file is accounted for only by the actual length of the data in it. So a 1KB file consumes 3KB of disk space with replication factor 3, and a 129MB file consumes 387MB of disk space, again with replication factor 3 (one full 128MB block plus a 1MB last block, times three replicas). The behaviour seen in the output was most likely caused by other, non-DFS disk usage that reduced the space available to HDFS, and had nothing to do with the file sizes. To demonstrate this with a 1KB test file:

# hdfs dfs -df -h
Filesystem          Size     Used  Available  Use%
hdfs://<nn>:8020  27.1 T    120 K     27.1 T    0%
# fallocate -l 1024 test.txt
# hdfs dfs -put test.txt /tmp
# hdfs dfs -df -h
Filesystem          Size     Used  Available  Use%
hdfs://<nn>:8020  27.1 T  123.0 K     27.1 T    0%

The Used value grew by roughly 3KB (1KB times 3 replicas), not by a full block. I hope this helps to clarify and correct this answer.
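A complementary way to verify the per-file accounting (a quick sketch, assuming the test file from above is still at /tmp/test.txt and a Hadoop release recent enough for -du to print the disk-space-consumed column; the values shown are illustrative):

# hdfs dfs -du -h /tmp/test.txt
1.0 K  3.0 K  /tmp/test.txt

The first column is the file length, the second is the disk space consumed by all replicas, i.e. the length multiplied by the replication factor.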
03-12-2018
05:35 AM
1 Kudo
Hi @lizard, if an HDFS DataNode reaches maximum capacity on a disk, it will stop using that disk, because the allocation of a new block first checks the available space on the disk. This check also takes the dfs.datanode.du.reserved setting into account, so if you reserve for example 10GB, and a disk has less than 10GB + block size of free space, no block will be allocated on that disk. If a DataNode is completely full and there is no disk left where at least one block can be allocated, that causes block allocation issues at the HDFS level. A DataNode with no free disk space can also run into problems during its internal operations, which is why we suggest sizing the cluster so that roughly 25% of the space stays free as a good minimum. Cheers, Pifta
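For reference, a quick way to check the configured reserve and what HDFS sees as remaining space per DataNode (the commands are standard, but the hostname, port and values below are only illustrative):

# hdfs getconf -confKey dfs.datanode.du.reserved
10737418240
# hdfs dfsadmin -report | grep -E "^Name|DFS Remaining"
Name: 10.0.0.11:50010 (dn1.example.com)
DFS Remaining: 52428800 (50 MB)
DFS Remaining%: 0.18%

A disk is only eligible for a new block while capacity minus DFS used minus the reserved space still leaves room for at least one block.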
03-07-2018
07:55 AM
1 Kudo
Hi Koc, this is quite interesting, because based on the code the exception looks like the result of a race condition. The getDiskBalancerStatus call looks like this in the code:

@Override // DataNodeMXBean
public String getDiskBalancerStatus() {
  try {
    return this.diskBalancer.queryWorkStatus().toJsonString();
  } catch (IOException ex) {
    LOG.debug("Reading diskbalancer Status failed. ex:{}", ex);
    return "";
  }
}

So the NullPointerException can happen when diskBalancer is null, or when queryWorkStatus() returns null. queryWorkStatus() throws an IOException when the disk balancer is not enabled, which is why disabling the disk balancer fixes the issue; otherwise queryWorkStatus() seems to always return a reference. That is why I suspect a race condition that leaves the diskBalancer reference null in the DataNode object at the moment getDiskBalancerStatus is called. Since getDiskBalancerStatus is exposed through the JMX interface, it is only invoked when the DataNode's JMX interface is being queried, and it should not prevent the DataNode startup. So this looks like something that should not fail the DataNode startup. Do you still have the startup logs from when the DataNode failed to start? Is anything else reported as an error or fatal? If you have the DataNode standard error output for a failed start (on a CDH cluster it is in the /var/run/cloudera-scm-agent/process/xxx-DATANODE/logs folders), it might contain further traces about the problem. Would you please check it? It would be nice to track this down and, if it is a bug, fix it. Thanks! Istvan
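If you want to check what the JMX side returns while the DataNode is running, the MXBean can be queried through the DataNode's HTTP port; a quick sketch, assuming the default CDH 5 web port of 50075 (adjust if yours differs):

# curl -s 'http://<datanode-host>:50075/jmx?qry=Hadoop:service=DataNode,name=DataNodeInfo'

In the JSON that comes back, the DiskBalancerStatus attribute corresponds to the method above; it is an empty string whenever queryWorkStatus() throws the IOException, and the NullPointerException from your report would surface during this call.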
03-07-2018
02:23 AM
Hi Lizard, the linked documentation contains good information, but if you need to rebalance the disks inside a DataNode right away, you can also run the intra-DataNode disk balancer (note that this is different from the HDFS Balancer). Details about the disk balancer are here: https://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/ A typical invocation is sketched below. Cheers, Pifta
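For reference, the disk balancer is driven by the hdfs diskbalancer CLI and has to be enabled with dfs.disk.balancer.enabled=true on the DataNodes. A typical run against a single DataNode looks roughly like this (the hostname is just a placeholder, and the -plan step prints the HDFS path of the plan file it generated, which you then pass to -execute):

# hdfs diskbalancer -plan <datanode-host>
# hdfs diskbalancer -execute <plan-file.json>
# hdfs diskbalancer -query <datanode-host>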