[HDFS] - What is "Non DFS Used"?

Rising Star

Hi all,

What exactly is "Non DFS Used"? Is it specific directories or files?

hdfs dfsadmin -report # for one datanode

Rack: /SMTS/BC21
Decommission Status : Normal
Configured Capacity: 35974718423040 (32.72 TB)
DFS Used: 9748915679350 (8.87 TB)
Non DFS Used: 12141363006 (11.31 GB)
DFS Remaining: 26213661380684 (23.84 TB)
DFS Used%: 27.10%
DFS Remaining%: 72.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16

df -k
/dev/sdb1            2928678656 798914704 2129763952  28% /var/opt/hosting/data/disk1
/dev/sdc1            2928678656 785083044 2143595612  27% /var/opt/hosting/data/disk2
/dev/sdd1            2928678656 798313760 2130364896  28% /var/opt/hosting/data/disk3
/dev/sde1            2928678656 799300600 2129378056  28% /var/opt/hosting/data/disk4
/dev/sdf1            2928678656 786169864 2142508792  27% /var/opt/hosting/data/disk5
/dev/sdg1            2928678656 799986864 2128691792  28% /var/opt/hosting/data/disk6
/dev/sdh1            2928678656 804181044 2124497612  28% /var/opt/hosting/data/disk7
/dev/sdi1            2928678656 789982192 2138696464  27% /var/opt/hosting/data/disk8
/dev/sdj1            2928678656 795951544 2132727112  28% /var/opt/hosting/data/disk9
/dev/sdk1            2928678656 789924668 2138753988  27% /var/opt/hosting/data/disk10
/dev/sdl1            2928678656 801960132 2126718524  28% /var/opt/hosting/data/disk11
/dev/sdm1            2928678656 782037100 2146641556  27% /var/opt/hosting/data/disk12


@mayki wogno

"Non DFS used" can be calculated by the following formula:

Non DFS Used = Configured Capacity - Remaining Space - DFS Used

Note that Configured Capacity = Total Disk Space - Reserved Space

Therefore, Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used

Reserved Space is set by the property dfs.datanode.du.reserved
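To make the formula concrete, here is a small Python check that plugs in the numbers from the report and the df -k output in the question. It is only a sanity check of the arithmetic: the variable names are made up for illustration, and the reserved-space figure is inferred from the numbers (the thread never states what dfs.datanode.du.reserved is set to on this cluster).

```python
# Values in bytes, copied from the `hdfs dfsadmin -report` output in the question.
configured_capacity = 35974718423040
dfs_used = 9748915679350
dfs_remaining = 26213661380684

# Non DFS Used = Configured Capacity - Remaining Space - DFS Used
non_dfs_used = configured_capacity - dfs_remaining - dfs_used
print(non_dfs_used)                       # 12141363006
print(round(non_dfs_used / 1024**3, 2))   # 11.31 (GB, as shown in the report)

# df -k lists 12 disks of 2928678656 one-KiB blocks each.
total_disk_space = 12 * 2928678656 * 1024

# Configured Capacity = Total Disk Space - Reserved Space, so the difference
# is the total reserved space implied by these numbers (an inference, not a
# value read from the cluster configuration).
reserved_total = total_disk_space - configured_capacity
print(reserved_total // 12)               # 1073741824, i.e. 1 GiB per disk
```

The result matches the "Non DFS Used: 12141363006 (11.31 GB)" line exactly, and the capacity gap works out to 1 GiB per disk, which would be consistent with dfs.datanode.du.reserved being set to 1 GiB per volume.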

Super Guru

Here is a good post that explains what "Non DFS Used" means.

http://stackoverflow.com/questions/18477983/what-exactly-non-dfs-used-means

Rising Star

@slachterman: I've read this formula, and I know this property.

But I want to know what exactly is occupying this space.


@mayki wogno

It's essentially non-HDFS data on the disks that back dfs.datanode.data.dir. This could include log files, intermediate shuffle output from MapReduce jobs, local data files (if you put them on a data node), etc. You can use du or a similar tool to investigate further.
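If you'd rather script the investigation than run du by hand, something like the following Python sketch could serve as a rough stand-in. The function name usage_by_entry is made up for illustration, and it sums apparent file sizes (st_size) rather than allocated disk blocks, so its totals can differ somewhat from what du or df report.

```python
import os


def usage_by_entry(root):
    """Return {top-level entry name: total bytes} for everything under root.

    Sizes are apparent file sizes (st_size); symlinks are skipped so that
    data living elsewhere is not counted against this directory.
    """
    totals = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            size = 0
            # os.walk does not descend into symlinked directories by default.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        size += os.path.getsize(path)
            totals[entry.name] = size
        elif entry.is_file(follow_symlinks=False):
            totals[entry.name] = entry.stat(follow_symlinks=False).st_size
    return totals
```

Calling usage_by_entry on the mount point of one data disk (for example /var/opt/hosting/data/disk1 from the df output in the question) gives a per-directory breakdown, which should help show which directories other than the DataNode's own data directory are taking up space.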

Rising Star

dfs.data.dirs is not defined in my cluster, so where is the non-HDFS data stored?

Rising Star

Is dfs.datanode.data.dirs the same as dfs.data.dirs?


Sorry, I've corrected the typo. dfs.datanode.data.dir is the correct property name.