It is a feature of Hadoop that it can tackle node failure. But how Namenode handles the Datanode failures?
The NameNode stores the Metadata (data about data) of all the block locations of the whole cluster. NameNode periodically receives a Heartbeat and a Block report from each of the DataNodes in the cluster. Continuous data node Heartbeat implies that the DataNode is functioning properly. Blockreport contains a list of all blocks on a DataNode. When NameNode notices that it has not received a heartbeat message from a data node after a certain amount of time, the data node is marked as dead. Since blocks will be under-replicated the system begins replicating the blocks that were stored on the dead DataNode.
The NameNode orchestrates the replication of data blocks stored on the failed DataNode to another. The replication data transfer happens directly between DataNode and the data never passes through the NameNode.Once the dead data node has been replaced usually a JBOD on a low-end server (recommissioning) then the cluster admin can run the re-balancer to redistribute the data block to the newly recommissioned data node.
$ hdfs balancer --help Usage: java Balancer [-policy <policy>] the balancing policy: datanode or blockpool [-threshold <threshold>] Percentage of disk capacity [-exclude [-f <hosts-file> | comma-separated list of hosts]] [-include [-f <hosts-file> | comma-separated list of hosts]]
Setting the Proper Threshold Value for the Balancer
You can run the balancer command without any parameters, as shown here:
$ sudo –u hdfs hdfs balancer
The balancer command uses the default threshold of 10 percent. This means that the balancer will balance data by moving blocks from over-utilized to under-utilized nodes, until each DataNode’s disk usage differs by no more than plus or minus 10 percent of the average disk usage in the cluster.
Sometimes, you may wish to set the threshold to a different level for example, when free space in the cluster is getting low and you want to keep the used storage levels on the individual DataNodes within a smaller range
$ hdfs balancer –threshold 5
This process could take longer depending on the size of the data to replicate. There is a lot of material out there if you want a hands-on walk through the decommissioning and recommissioning of a data node.
Hope that explains the usage and usecase