Created 06-28-2016 02:26 PM
Hi all,
What exactly is "Non DFS Used"? Does it correspond to specific directories or files?
hdfs dfsadmin -report    # for one datanode
Rack: /SMTS/BC21
Decommission Status : Normal
Configured Capacity: 35974718423040 (32.72 TB)
DFS Used: 9748915679350 (8.87 TB)
Non DFS Used: 12141363006 (11.31 GB)
DFS Remaining: 26213661380684 (23.84 TB)
DFS Used%: 27.10%
DFS Remaining%: 72.87%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16

df -k
/dev/sdb1 2928678656 798914704 2129763952 28% /var/opt/hosting/data/disk1
/dev/sdc1 2928678656 785083044 2143595612 27% /var/opt/hosting/data/disk2
/dev/sdd1 2928678656 798313760 2130364896 28% /var/opt/hosting/data/disk3
/dev/sde1 2928678656 799300600 2129378056 28% /var/opt/hosting/data/disk4
/dev/sdf1 2928678656 786169864 2142508792 27% /var/opt/hosting/data/disk5
/dev/sdg1 2928678656 799986864 2128691792 28% /var/opt/hosting/data/disk6
/dev/sdh1 2928678656 804181044 2124497612 28% /var/opt/hosting/data/disk7
/dev/sdi1 2928678656 789982192 2138696464 27% /var/opt/hosting/data/disk8
/dev/sdj1 2928678656 795951544 2132727112 28% /var/opt/hosting/data/disk9
/dev/sdk1 2928678656 789924668 2138753988 27% /var/opt/hosting/data/disk10
/dev/sdl1 2928678656 801960132 2126718524 28% /var/opt/hosting/data/disk11
/dev/sdm1 2928678656 782037100 2146641556 27% /var/opt/hosting/data/disk12
Created 06-28-2016 02:34 PM
Here is a good post that explains Non DFS Used:
http://stackoverflow.com/questions/18477983/what-exactly-non-dfs-used-means
Created 06-28-2016 02:34 PM
"Non DFS used" can be calculated by the following formula:
Non DFS Used = Configured Capacity - Remaining Space - DFS Used
Note that Configured Capacity = Total Disk Space - Reserved Space
Therefore, Non DFS Used = (Total Disk Space - Reserved Space) - Remaining Space - DFS Used
Reserved Space is set by the property dfs.datanode.du.reserved (in bytes, per volume).
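As a quick sanity check, the formula can be applied to the numbers in the report from the question. This is only a sketch in Python: the variable and function names are mine, and the reserved-space figure at the end is an inference from the df -k output (assuming 1024-byte blocks), not something the report states.

```python
# Values copied from the `hdfs dfsadmin -report` output in the question (bytes).
configured_capacity = 35974718423040
dfs_used = 9748915679350
dfs_remaining = 26213661380684


def non_dfs_used(configured: int, remaining: int, used: int) -> int:
    """Non DFS Used = Configured Capacity - Remaining Space - DFS Used."""
    return configured - remaining - used


result = non_dfs_used(configured_capacity, dfs_remaining, dfs_used)
print(result)                      # 12141363006, matching the report
print(f"{result / 2**30:.2f} GB")  # 11.31 GB (the report uses 1024-based units)

# Assumption: `df -k` reports 1024-byte blocks, and all 12 disks are data volumes.
total_disk_space = 12 * 2928678656 * 1024
implied_reserved = total_disk_space - configured_capacity
print(implied_reserved // 12)      # 1073741824 bytes, i.e. 1 GiB per disk
```

The last line suggests dfs.datanode.du.reserved is set to 1 GiB on this node, but the cluster configuration is the place to confirm that.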
Created 06-28-2016 02:38 PM
@slachterman: I've read this formula, and I know this property.
But I want to know what exactly is occupying this space.
Created 06-28-2016 02:42 PM
It's essentially non-HDFS data on the volumes that hold dfs.datanode.data.dir. This could include log files, intermediate shuffle output from MapReduce jobs, local data files (if you put them on a data node), etc. You can use du or a similar tool to investigate further.
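If du is not convenient, a rough equivalent of "du -s *" can be sketched in Python. The function name top_level_usage is hypothetical, and it sums apparent file sizes rather than allocated disk blocks, so the numbers can differ somewhat from du. On a data disk, anything large outside the DataNode's own block storage (typically the current subdirectory of the data dir) is a candidate for Non DFS Used.

```python
import os
from typing import Dict


def top_level_usage(root: str) -> Dict[str, int]:
    """Return the apparent size in bytes of each top-level entry under root."""
    usage = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            # Walk the subtree and add up file sizes without following symlinks.
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    try:
                        total += os.lstat(os.path.join(dirpath, name)).st_size
                    except OSError:
                        pass  # file vanished or is unreadable; skip it
            usage[entry.name] = total
        else:
            usage[entry.name] = entry.stat(follow_symlinks=False).st_size
    return usage


# Example call (the mount point is taken from the df output in the question):
# for name, size in sorted(top_level_usage("/var/opt/hosting/data/disk1").items(),
#                          key=lambda kv: -kv[1]):
#     print(f"{size:>16,d}  {name}")
```

Run it as a user that can read the whole mount point, otherwise unreadable subtrees are silently undercounted.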
Created 06-28-2016 02:52 PM
dfs.data.dirs is not defined in my cluster, so where is the non-HDFS data stored?
Created 06-28-2016 02:55 PM
Is dfs.datanode.data.dirs the same as dfs.data.dirs?
Created 06-28-2016 06:25 PM
Sorry, I've corrected the typo. dfs.datanode.data.dir is the correct property name.