My colleague has run into an issue with HDFS where it reports incorrect capacity values. He added a new data node to the cluster whose disks have a slightly different layout, and ever since, HDFS has been reporting wrong values.
The calculated configured capacity is wrong. The wrong value is shown both in the WebGUI and on the HDFS command line.
- Virtual Machine with 1 x 600 GB disk, which holds both the OS and HDFS
2) NODE 4 (new node) running RHEL7.2
- Virtual Machine with 4 disks, three of which are 300 GB. Each of the three extra disks is mounted for HDFS only, on
/data1/, /data2/, /data3/ etc. HDFS is configured to pick these drives up.
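As a rough sanity check, the raw disk space HDFS could see can be estimated from this layout. This is only a sketch: it assumes the single 600 GB disk description applies to each of NODE1-3 and that NODE4 contributes only its three 300 GB disks. The function and dictionary names are made up for illustration.

```python
# Rough upper bound on HDFS capacity derived from the disk layout.
# Assumption: NODE1-3 each have one 600 GB disk (shared with the OS),
# and NODE4 gives three dedicated 300 GB disks to HDFS.
GB_PER_TB = 1024

def expected_raw_capacity_gb(nodes):
    """Sum the sizes (in GB) of every disk assigned to HDFS on every node."""
    return sum(sum(disks) for disks in nodes.values())

layout = {
    "node1": [600],
    "node2": [600],
    "node3": [600],
    "node4": [300, 300, 300],
}

total_gb = expected_raw_capacity_gb(layout)
total_tb = total_gb / GB_PER_TB
print(f"Expected raw capacity: {total_gb} GB (~{total_tb:.2f} TB)")
```

The real usable figure would be somewhat lower, since the OS shares the disk on the older nodes.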
To summarize, with only NODE1-3 configured, HDFS was working fine. Since adding NODE4, HDFS reports wrong capacity values but keeps functioning. We are not sure what will happen when the disk space limit is hit.
The calculated sizes are:
Configured Capacity: 7.36 TB
DFS Used: 1.16 TB
Non DFS Used: 2.79 TB
DFS Remaining: 3.36 TB
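For what it's worth, here is a quick check of whether these figures are at least consistent with each other. It assumes that Configured Capacity should be roughly DFS Used + Non DFS Used + DFS Remaining, which is my understanding of the report and may not hold exactly for every Hadoop version. The variable names are just illustrative.

```python
# Values copied from the report above, in TB.
configured = 7.36
dfs_used = 1.16
non_dfs_used = 2.79
dfs_remaining = 3.36

# Assumption: configured capacity ~= DFS used + non-DFS used + DFS remaining.
accounted = dfs_used + non_dfs_used + dfs_remaining
gap = configured - accounted

print(f"Accounted for: {accounted:.2f} TB")
print(f"Unaccounted:   {gap:.2f} TB")
```

The three parts come to within about 0.05 TB of the configured total, so the breakdown looks roughly self-consistent. The problem seems to be the total itself, which is far larger than the physical disks can provide.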
I've tried restarting services and various other things, but nothing works. The JDK version is the same for both nodes. The only difference I could see was the version of the OS (which, I know, should be the same).
Any thoughts and suggestions are much appreciated!