Created 11-28-2017 05:04 PM
I am installing a new cluster for research and teaching. After installing and configuring it, I got the following report from hdfs dfsadmin -report:
Configured Capacity: 37791439071232 (34.37 TB)
Present Capacity: 35614683604992 (32.39 TB)
DFS Remaining: 35613520156672 (32.39 TB)
DFS Used: 1163448320 (1.08 GB)
DFS Used%: 0.00%
Under replicated blocks: 36
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
-------------------------------------------------
Live datanodes (1):
Name: xxx.xxx.xxx.xx:50010 (xxx.massey.ac.nz)
Hostname: xxx.massey.ac.nz
Decommission Status : Normal
Configured Capacity: 37791439071232 (34.37 TB)
DFS Used: 1163448320 (1.08 GB)
Non DFS Used: 0 (0 B)
DFS Remaining: 35613520156672 (32.39 TB)
DFS Used%: 0.00%
DFS Remaining%: 94.24%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Tue Nov 28 15:34:44 NZDT 2017
Last Block Report: Tue Nov 28 15:16:47 NZDT 2017
However, the machine only has about 12 TB of physical disk (df -k):
/dev/sda2 990298620 790814264 149157060 85% /
/dev/sda1 523248 3700 519548 1% /boot/efi
/dev/sdb1 10544486348 2109284 10010942624 1% /hadoop
As one can see, we only have around 11 TB of free space, but HDFS reports 32.39 TB. I have already cleaned up thoroughly (ran the Python cleanup script, removed every deb package, deleted every user related to Ambari/HDFS, and deleted every configuration file, log, etc.) and reinstalled Ambari from scratch three times, and the result is the same. Stranger yet, there is another machine with the same hardware where we installed the previous version of Ambari, and that one works correctly.
Ambari version: 2.6.0
HDFS version: 2-6-3-0-235 (hadoop etc)
Thanks for any help or clues that may help.
Andre.
Created 12-01-2017 01:25 AM
What is the conf dfs.datanode.data.dir set to? I suspect it is set to different directories under the same physical disk.
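One way to check this suspicion is sketched below. It assumes the DataNode treats each directory listed in dfs.datanode.data.dir as a separate volume and adds up the size of the filesystem behind each one, so several directories on one disk get counted several times. The function name capacity_report and the script itself are illustrative only, not part of Hadoop or Ambari. Pass it the directories from your own dfs.datanode.data.dir setting.

```python
import os
import shutil
import sys
from collections import defaultdict


def capacity_report(data_dirs):
    """Compare capacity summed per directory with capacity summed per filesystem.

    Returns (naive_total, unique_total, shared), where `shared` maps a device id
    to the directories that live on that same filesystem.
    """
    by_device = defaultdict(list)
    naive_total = 0
    for d in data_dirs:
        # Total size of the filesystem that holds this directory.
        naive_total += shutil.disk_usage(d).total
        # st_dev identifies the filesystem the directory is on.
        by_device[os.stat(d).st_dev].append(d)

    # Count each filesystem only once.
    unique_total = sum(
        shutil.disk_usage(dirs[0]).total for dirs in by_device.values()
    )
    shared = {dev: dirs for dev, dirs in by_device.items() if len(dirs) > 1}
    return naive_total, unique_total, shared


if __name__ == "__main__":
    # Usage: python check_dirs.py <dir1> <dir2> ...
    naive, unique, shared = capacity_report(sys.argv[1:])
    print(f"sum over directories : {naive} bytes")
    print(f"sum over filesystems : {unique} bytes")
    for dev, dirs in shared.items():
        print(f"same filesystem (device {dev}): {', '.join(dirs)}")
```

If the two sums differ, at least two of the directories share a filesystem. For reference, /hadoop in the df -k output above is 10544486348 KiB, or 10797554020352 bytes, and the reported Configured Capacity of 37791439071232 bytes is exactly 3.5 times that. That fits several data directories sitting on that one disk, although the exact factor probably also depends on settings such as reserved space.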
Created 12-05-2017 02:01 AM
Thanks for the answer, Szetszwo; indeed that is the case. However, I have done that in the past with another cluster (a different version of Ambari) and it seemed to work fine. The reason I did that was that the nodes had 4 physical disks, and I mounted them at the same directories that the HDFS data goes to (using a single HDFS config for all datanodes). This server only has a single large disk, so I should probably have used a separate configuration for it.
I'll try that and report back. Thanks again for your help.
Andre.