Member since: 02-17-2016
Posts: 20
Kudos Received: 2
Solutions: 0
10-27-2025
05:15 PM
@fuchun As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post. Thanks.
07-22-2020
05:45 AM
The given solution is unfortunately not correct. In HDFS, a block that is open for write does reserve a full 128 MB, but as soon as the file is closed, the last block is accounted for only by the actual length of the file. So a 1 KB file consumes 3 KB of disk space with replication factor 3, and a 129 MB file consumes 387 MB of disk space, again with replication factor 3. The effect seen in the output was most likely caused by other non-DFS disk usage, which reduced the disk space available to HDFS, and had nothing to do with the file sizes.

Just to demonstrate this with a 1 KB test file:

# hdfs dfs -df -h
Filesystem        Size    Used     Available  Use%
hdfs://<nn>:8020  27.1 T  120 K    27.1 T     0%
# fallocate -l 1024 test.txt
# hdfs dfs -put test.txt /tmp
# hdfs dfs -df -h
Filesystem        Size    Used     Available  Use%
hdfs://<nn>:8020  27.1 T  123.0 K  27.1 T     0%

Note that Used grows by only 3 KB (1 KB times replication factor 3), not by a full block. I hope this helps to clarify and correct this answer.
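The accounting described above can be sketched as a quick calculation. This is a minimal illustration only; hdfs_disk_usage is a hypothetical helper written for this post, not part of any HDFS API, and it assumes the default 128 MB block size and replication factor 3.

```python
def hdfs_disk_usage(file_size, block_size=128 * 1024 * 1024, replication=3):
    """Raw disk space (bytes) a *closed* HDFS file consumes across replicas.

    Full blocks each consume block_size; the final, partial block is
    charged only by its actual length once the file is closed.
    """
    if file_size == 0:
        return 0
    full_blocks = file_size // block_size
    remainder = file_size % block_size
    per_replica = full_blocks * block_size + remainder
    return per_replica * replication

# A 1 KB file consumes 3 KB across 3 replicas, not 384 MB:
print(hdfs_disk_usage(1024))                 # 3072 bytes = 3 KB
# A 129 MB file consumes 387 MB across 3 replicas:
print(hdfs_disk_usage(129 * 1024 * 1024))    # 405798912 bytes = 387 MB
```

The key point the sketch makes is that only a file still open for write reserves the full block; the numbers match the 3 KB and 387 MB figures in the post.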
04-09-2018
08:51 AM
Hi @Harsh J, thank you for an even more thorough answer; the placement policy is clear now. I didn't see the risk of rogue Yarn apps before, so that's very helpful. Many thanks!