how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

Expert Contributor

How is "DFS Remaining" in "hdfs dfsadmin -report" computed? For example, here is one datanode entry from the report:

Name: ***.***.***.***:*** (***)
Hostname: ***
Decommission Status : Normal
Configured Capacity: 83476791296 (77.74 GB)
DFS Used: 1003606016 (957.11 MB)
Non DFS Used: 11966496768 (11.14 GB)
DFS Remaining: 70506688512 (65.66 GB)
DFS Used%: 1.20%
DFS Remaining%: 84.46%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: ***

Re: how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

@Saumil Mayani

The filesystems specified for the datanode directories do not necessarily contain only HDFS data. For example, you may have /data01 as the mount point for your datanode, with some other files in /data01/temp or something like that. The files in /data01/datanode will be the portion that is "DFS Used"; the portion in other directories on /data01 will be "Non DFS Used". The "DFS Remaining" will be the balance:

DFS Remaining = FS Size - DFS Used - Non DFS Used
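As a quick check, the formula can be applied to the numbers in the report from the question. This is only a sketch: it assumes "FS Size" corresponds to the "Configured Capacity" line of the report, and the variable names are made up for illustration.

```python
# Byte counts copied from the datanode entry in the question.
# Assumption: "FS Size" in the formula is the report's "Configured Capacity".
configured_capacity = 83476791296
dfs_used = 1003606016
non_dfs_used = 11966496768

# DFS Remaining = FS Size - DFS Used - Non DFS Used
dfs_remaining = configured_capacity - dfs_used - non_dfs_used
print(dfs_remaining)  # 70506688512, the "DFS Remaining" value in the report

# The percentage lines are relative to the configured capacity.
dfs_used_pct = 100 * dfs_used / configured_capacity
dfs_remaining_pct = 100 * dfs_remaining / configured_capacity
print(f"DFS Used%: {dfs_used_pct:.2f}%")            # DFS Used%: 1.20%
print(f"DFS Remaining%: {dfs_remaining_pct:.2f}%")  # DFS Remaining%: 84.46%
```

With these inputs the result agrees with the "DFS Remaining", "DFS Used%" and "DFS Remaining%" lines shown in the report.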

Re: how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

Expert Contributor

@emaxwell In the above formula, FS Size, DFS Used and Non DFS Used are known from physical disk usage. However, I notice that when a datanode is restarted, "Non DFS Used" goes down and "DFS Remaining" goes up. How often are FS Size, DFS Used and Non DFS Used recorded?

Re: how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

@Saumil Mayani

When you run "dfsadmin -report", it gathers the information. There may be temp directories on the disk where jobs are storing data, or there could be temp files within HDFS that are getting removed on a restart. The amount of space is fluid and collected when you ask for the report.

Re: how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

Expert Contributor

@emaxwell Surely this cannot be computed only when the report is requested? The NameNode also needs this information to determine whether a datanode has space available for clients to write data.

Re: how is "DFS Remaining" in "hdfs dfsadmin -report" computed?

@Saumil Mayani

There are constant heartbeats, block reports, and other information exchanges between the namenode and the datanodes to keep track of where blocks are located, available space, under-replicated blocks, etc. When you run "dfsadmin -report", it uses the current information that the namenode has. This information is updated regularly. If you restart HDFS, each datanode takes an inventory and reports back to the namenode. If temporary files have been removed on restart, this will be reflected in the block reports sent back to the namenode.
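To see how the numbers move over time (for example before and after a datanode restart), one option is to save the report text and compare the fields between runs. The following is a minimal sketch that parses the numeric lines of one datanode entry, assuming the line layout shown in the question; `parse_datanode_entry` is a hypothetical helper, not part of Hadoop.

```python
import re

def parse_datanode_entry(text):
    """Hypothetical helper: map 'Field: <bytes> (<human size>)' lines to ints."""
    fields = {}
    for line in text.splitlines():
        # Matches lines such as "DFS Used: 1003606016 (957.11 MB)".
        m = re.match(r"^\s*([^:]+?)\s*:\s*(\d+)\s*\(", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

# Sample text copied from the report in the question.
sample = """\
Decommission Status : Normal
Configured Capacity: 83476791296 (77.74 GB)
DFS Used: 1003606016 (957.11 MB)
Non DFS Used: 11966496768 (11.14 GB)
DFS Remaining: 70506688512 (65.66 GB)
DFS Used%: 1.20%
"""

entry = parse_datanode_entry(sample)
balance = (entry["Configured Capacity"]
           - entry["DFS Used"]
           - entry["Non DFS Used"])
print(balance == entry["DFS Remaining"])  # True for the sample above
```

Running the same parse on reports captured at different times would show which of the fields changed between them.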
