Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

[HDFS] - Non DFS used ?

avatar
Rising Star

Hi all,

What exactly is "Non DFS Used"? Does it correspond to particular directories or files?

hdfs dfsadmin -report # for one datanode

Rack: /SMTS/BC21
Decommission Status : Normal
Configured Capacity: 35974718423040 (32.72 TB)
DFS Used: 9748915679350 (8.87 TB)
Non DFS Used: 12141363006 (11.31 GB)
DFS Remaining: 26213661380684 (23.84 TB)
DFS Used%: 27.10%
DFS Remaining%: 72.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16

df -k
/dev/sdb1            2928678656 798914704 2129763952  28% /var/opt/hosting/data/disk1
/dev/sdc1            2928678656 785083044 2143595612  27% /var/opt/hosting/data/disk2
/dev/sdd1            2928678656 798313760 2130364896  28% /var/opt/hosting/data/disk3
/dev/sde1            2928678656 799300600 2129378056  28% /var/opt/hosting/data/disk4
/dev/sdf1            2928678656 786169864 2142508792  27% /var/opt/hosting/data/disk5
/dev/sdg1            2928678656 799986864 2128691792  28% /var/opt/hosting/data/disk6
/dev/sdh1            2928678656 804181044 2124497612  28% /var/opt/hosting/data/disk7
/dev/sdi1            2928678656 789982192 2138696464  27% /var/opt/hosting/data/disk8
/dev/sdj1            2928678656 795951544 2132727112  28% /var/opt/hosting/data/disk9
/dev/sdk1            2928678656 789924668 2138753988  27% /var/opt/hosting/data/disk10
/dev/sdl1            2928678656 801960132 2126718524  28% /var/opt/hosting/data/disk11
/dev/sdm1            2928678656 782037100 2146641556  27% /var/opt/hosting/data/disk12

avatar

@mayki wogno

"Non DFS used" can be calculated by the following formula:

Non DFS Used = Configured Capacity - Remaining Space - DFS Used

Note that Configured Capacity = Total Disk Space - Reserved Space

Therefore, Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used

Reserved Space is set by the property dfs.datanode.du.reserved
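
As a quick sanity check, the formula can be applied to the numbers posted in the question. This is only a sketch: the byte values are copied from the `hdfs dfsadmin -report` and `df -k` output above, and the variable names are made up for illustration. The per-volume reservation at the end is inferred from those numbers, not something stated in the thread.

```python
# Values copied from the `hdfs dfsadmin -report` output in the question (bytes).
configured_capacity = 35974718423040
dfs_used = 9748915679350
dfs_remaining = 26213661380684

# Non DFS Used = Configured Capacity - Remaining Space - DFS Used
non_dfs_used = configured_capacity - dfs_remaining - dfs_used
print(non_dfs_used)  # 12141363006, the "Non DFS Used" value in the report

# `df -k` lists 12 data filesystems of 2928678656 1K-blocks each.
volumes = 12
total_disk_space = volumes * 2928678656 * 1024

# Configured Capacity = Total Disk Space - Reserved Space
reserved_total = total_disk_space - configured_capacity
reserved_per_volume = reserved_total // volumes
print(reserved_total)       # 12884901888
print(reserved_per_volume)  # 1073741824, i.e. 1 GiB per volume (inferred)
```

The first result matches the reported 11.31 GB exactly, so on this cluster the report is consistent with the formula. The second part suggests dfs.datanode.du.reserved is 1 GiB per volume here, but that is a guess from the arithmetic and should be checked against the actual configuration.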

avatar
Super Guru

Here is a good post that explains Non DFS Used.

http://stackoverflow.com/questions/18477983/what-exactly-non-dfs-used-means

avatar
Rising Star

@slachterman: I've read this formula, and I know this property.

But I want to know what exactly occupies this space.

avatar

@mayki wogno

It's essentially any non-HDFS data on the disks that back dfs.datanode.data.dir. This could include log files, intermediate shuffle output from MapReduce jobs, local data files (if you put them on a data node), etc. You can use du or a similar tool to investigate further.
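
If you would rather script the investigation than run du by hand, something along these lines could work. It is a minimal sketch, not an HDFS tool: the function names `tree_size` and `top_level_usage` are made up for illustration, and it sums apparent file sizes, so the totals can differ somewhat from what du or the DataNode report (which count disk blocks). Run it with enough permissions to read the mount point.

```python
import os


def tree_size(path):
    """Sum the apparent size in bytes of all files under path (symlinks not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def top_level_usage(mount_point):
    """Map each top-level entry of mount_point to its size in bytes, largest first."""
    usage = {}
    for entry in os.scandir(mount_point):
        if entry.is_dir(follow_symlinks=False):
            usage[entry.name] = tree_size(entry.path)
        else:
            usage[entry.name] = entry.stat(follow_symlinks=False).st_size
    return dict(sorted(usage.items(), key=lambda kv: kv[1], reverse=True))
```

For example, `top_level_usage("/var/opt/hosting/data/disk1")` (one of the mount points from the df output in the question) would list what sits at the top of that disk. Anything large that is not the DataNode's own block storage directory is a candidate for the Non DFS Used space.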

avatar
Rising Star

dfs.data.dirs is not defined in my cluster, so where is the non-HDFS data stored?

avatar
Rising Star

Is dfs.datanode.data.dirs the same as dfs.data.dirs?

avatar

Sorry, I've corrected the typo. The correct property is dfs.datanode.data.dir.