I have a cluster with 6 DataNodes (plus 2 NameNodes) running CDH 5.9.
The DataNode specs are as follows:
4 DN - 180 GB RAM, 30 TB disk space each
2 DN - 55 GB RAM, 15 TB disk space each
Four nodes already existed and we added the other two later. When adding them, we decided to keep the same role configuration as on the older four. This obviously overcommitted memory. Over time we saw DFS disk usage on the 2 nodes climb higher (80%) than on the 4 nodes (65%).
So I want to understand whether the memory overcommit is what causes these two DataNodes to be used more, or whether there is something else I need to tune.
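
To put the percentages in absolute terms, here is a rough calculation. It assumes that the two nodes at 80% are the 15 TB nodes and that the percentages are measured against each node's full disk capacity; the helper name `used_tb` is just for illustration.

```python
# Rough per-node DFS usage in absolute terms.
# Assumptions: the 80% figure applies to the 15 TB nodes, the 65% figure
# to the 30 TB nodes, and both are fractions of the full disk capacity.

def used_tb(capacity_tb, used_fraction):
    """Return the absolute space used on one DataNode, in TB."""
    return capacity_tb * used_fraction

small_node_used = used_tb(15, 0.80)  # one of the 2 smaller nodes
large_node_used = used_tb(30, 0.65)  # one of the 4 larger nodes

print(f"15 TB node at 80%: {small_node_used:.1f} TB used")
print(f"30 TB node at 65%: {large_node_used:.1f} TB used")
```

Under those assumptions the smaller nodes hold less data in absolute terms even though their usage percentage is higher.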