Member since
02-17-2016
20
Posts
2
Kudos Received
0
Solutions
07-22-2020
05:45 AM
The given solution is unfortunately not correct. In HDFS, a block that is open for write does reserve the full 128 MB, but as soon as the file is closed, the last block of the file is accounted for only by the actual file length. So a 1 KB file consumes 3 KB of disk space with replication factor 3, and a 129 MB file consumes 387 MB of disk space, again with replication factor 3. The behaviour seen in the output was most likely caused by other, non-DFS disk usage that reduced the space available to HDFS, and had nothing to do with the file sizes. To demonstrate this with a 1 KB test file:

# hdfs dfs -df -h
Filesystem          Size     Used  Available  Use%
hdfs://<nn>:8020  27.1 T    120 K     27.1 T    0%
# fallocate -l 1024 test.txt
# hdfs dfs -put test.txt /tmp
# hdfs dfs -df -h
Filesystem          Size     Used  Available  Use%
hdfs://<nn>:8020  27.1 T  123.0 K     27.1 T    0%

I hope this helps to clarify and correct this answer.
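As a side note, if you want to see the per-file accounting directly rather than the cluster-wide totals, hdfs dfs -du reports both the file length and the space consumed across all replicas. The commands below are a minimal sketch assuming the same /tmp/test.txt file and replication factor 3; the exact output formatting depends on your HDFS version, so treat it as illustrative:

# First column is the file length, second is the space consumed by all replicas
# (illustrative output for a 1 KB file with replication factor 3)
# hdfs dfs -du -h /tmp/test.txt
1.0 K  3.0 K  /tmp/test.txt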
03-17-2019
09:03 PM
This enables the disk balancer feature on a cluster. By default, the disk balancer is disabled. Then why does your config show + <value>false</value>?
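For reference, enabling the disk balancer is normally done with the dfs.disk.balancer.enabled property in hdfs-site.xml; setting it to false keeps the feature off. The snippet below is a minimal sketch, and the datanode hostname and plan file name in the follow-up commands are placeholders:

<!-- hdfs-site.xml: turn on the intra-datanode disk balancer -->
<property>
  <name>dfs.disk.balancer.enabled</name>
  <value>true</value>
</property>

# After restarting the DataNodes, generate and run a plan for one node
# (hostname and plan file are example values)
hdfs diskbalancer -plan datanode1.example.com
hdfs diskbalancer -execute /system/diskbalancer/datanode1.example.com.plan.json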
04-09-2018
08:51 AM
Hi @Harsh J, thank you for the even more thorough answer; the placement policy is clear now. I hadn't seen the risk of rogue YARN apps before, so this is very helpful. Many thanks!