How to check the total amount of data present on an HDP cluster?

How can I check the total amount of data present on an HDP cluster?

1 ACCEPTED SOLUTION

@ANSARI FAHEEM AHMED

1) If you hover over the "HDFS Disk Usage" widget (upper left-hand corner of the Ambari Dashboard), it shows the following details:

DFS Used: Storage used for data

Non-DFS Used: Storage used for things such as logs and shuffle writes

Remaining: Remaining storage

2) From the command line, you can also run "sudo -u hdfs hdfs dfsadmin -report", which generates a full report of HDFS storage usage.

3) Finally, if you would like to check the disk usage of a particular folder (and its subfolders), you can use commands like "hadoop fsck", "hadoop fs -dus" (deprecated in newer Hadoop releases in favor of "hadoop fs -du -s") or "hadoop fs -count -q". For an explanation of the differences between these commands, as well as how to read the results, please take a look at this post:

http://www.michael-noll.com/blog/2011/10/20/understanding-hdfs-quotas-and-hadoop-fs-and-fsck-tools/
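
To make options 2 and 3 a bit more concrete, here is a minimal sketch of two shell helpers for reading the command output. The function names (summarize_dfs_report, label_count_q) and the path /user/example are made up for illustration. The helpers assume the usual "Key: value" layout of the dfsadmin report and the eight-column layout of "hadoop fs -count -q" (QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME), so check them against the output of your own Hadoop version.

```shell
#!/bin/sh

# Print the cluster-wide totals from `hdfs dfsadmin -report` output on stdin.
# The report repeats keys such as "DFS Used:" once per DataNode, so only the
# first occurrence of each key (the cluster summary at the top) is kept.
summarize_dfs_report() {
  awk -F': ' '
    /^(Configured Capacity|Present Capacity|DFS Remaining|DFS Used): / {
      if (!($1 in seen)) { seen[$1] = 1; print $1 "=" $2 }
    }
  '
}

# Label the columns of one `hadoop fs -count -q <path>` output line on stdin.
# Assumes the path contains no whitespace.
label_count_q() {
  awk '{
    printf "quota=%s remaining_quota=%s space_quota=%s remaining_space_quota=%s ", $1, $2, $3, $4
    printf "dirs=%s files=%s bytes=%s path=%s\n", $5, $6, $7, $8
  }'
}

# Typical use on a cluster node (the report needs the hdfs superuser):
#   sudo -u hdfs hdfs dfsadmin -report | summarize_dfs_report
#   hadoop fs -count -q /user/example | label_count_q
#   hadoop fs -du -s -h /user/example
```

Keep in mind that "DFS Used" in the report is raw space consumed on the DataNodes, so it includes every replica, while the content size reported by "hadoop fs -count -q" is the size of the data before replication. On a cluster with the default replication factor of 3, the two numbers will differ accordingly.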
