Query on Disk Usage (DFS Used) in Ambari UI

Rising Star

I am using HDP 2.6.4 and Ambari 2.6.1.5. On the Ambari HDFS summary page there is a metric called "Disk Usage (DFS Used)", which in my case shows 19 GB. If I run hdfs dfs -du -h / it gives a total of 6 GB. Shouldn't these two results be the same, or am I missing something here?


1 ACCEPTED SOLUTION

Super Collaborator
@Vinuraj M

This comes from the HDFS replication factor, which defaults to 3. Each file you place on HDFS is stored three times on disks across the cluster for redundancy and tolerance of node failures. So 'du -h' gives you the sum of the sizes of the files you have placed on HDFS, whereas the HDFS disk usage figure gives you the total disk space consumed across all replicas.

6.XX GB * 3 replication factor = ~19 GB
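
If it helps, the same arithmetic can be sketched as a small shell helper. This is only an illustration: the function name raw_usage_gb is made up, and 6.3 is a stand-in for the "6.XX" figure above, not a measured value.

```shell
# Rough estimate of raw HDFS usage: logical size times replication factor.
# raw_usage_gb is a hypothetical helper; both arguments are illustrative.
#   $1 = logical size in GB (what 'hdfs dfs -du' sums up)
#   $2 = replication factor (HDFS default is 3)
raw_usage_gb() {
  awk -v size="$1" -v rep="$2" 'BEGIN { printf "%.1f\n", size * rep }'
}

raw_usage_gb 6.3 3   # prints 18.9
```

The estimate only holds when every file uses the same replication factor; files written with a different factor will shift the ratio.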

3 REPLIES

Rising Star

@anarasimham, thanks for the info. Is there any reference to documentation stating this?

Super Collaborator

I couldn't find any documentation on this specific calculation, but you can confirm it through testing, as you have already started to do. To verify, put a 2 GB file into HDFS and take both measurements before and after. With the default replication factor of 3, the 'du' total should grow by about 2 GB and DFS Used by about 6 GB.
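
One way to script that before/after check is sketched below. The helper names (parse_dfs_used, logical_bytes, raw_bytes, snapshot) and the test file path are made up for illustration. It assumes the first "DFS Used:" line of 'hdfs dfsadmin -report' is the cluster-wide total, and that your user is allowed to run dfsadmin (it may need the HDFS superuser).

```shell
# Hypothetical helpers for comparing logical size with raw DFS usage.

# Read 'hdfs dfsadmin -report' text on stdin and print the byte count from
# the first "DFS Used:" line (assumed to be the cluster-wide summary).
parse_dfs_used() {
  awk -F': ' '/^DFS Used:/ { split($2, a, " "); print a[1]; exit }'
}

# Logical size of / in bytes: first column of 'hdfs dfs -du -s'.
logical_bytes() {
  hdfs dfs -du -s / | awk '{ print $1; exit }'
}

# Raw bytes consumed on the DataNodes, replicas included.
raw_bytes() {
  hdfs dfsadmin -report | parse_dfs_used
}

# Print both numbers on one line so two runs can be compared.
snapshot() {
  echo "logical=$(logical_bytes) raw=$(raw_bytes)"
}

# Usage, with /tmp/test-2g.bin standing in for any large local file:
#   snapshot
#   hdfs dfs -put /tmp/test-2g.bin /tmp/
#   snapshot
```

The raw figure is built from DataNode reports, so it may take a short while to catch up after the put. Running hdfs dfs -stat %r on the uploaded file shows the replication factor it was actually written with.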