
Datanode recommissioning

New Contributor

Do I need to delete all data from a datanode before recommissioning it, or does it not matter because the namenode will not pick up stale data from the datanode?


@guido I think so. Check the doc if that helps:


New Contributor

Thanks for your answer.

I cannot find anything in the documentation that clarifies whether I need to delete all the data.

Also, I'm using Ambari, not Cloudera Manager (Hortonworks HDP 2.6.5, Hadoop 2.7.3).

From what I understand, deleting all the data would help balance the disks within the node (I had to replace a faulty disk). Since I'm stuck on Hadoop 2.7.3, there is no built-in intra-node disk balancer.
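To put a number on how uneven the disks are after swapping one out, a small script along these lines might help. It is only a sketch: the function names are made up for illustration, and the directories you pass in should be whatever `dfs.datanode.data.dir` lists in your own hdfs-site.xml.

```python
import shutil


def disk_usage_percent(data_dirs):
    """Return {directory: percent used} for the filesystem backing each directory."""
    usage = {}
    for d in data_dirs:
        total, used, _free = shutil.disk_usage(d)
        usage[d] = 100.0 * used / total
    return usage


def usage_spread(usage):
    """Gap between the fullest and the emptiest disk, in percentage points."""
    values = list(usage.values())
    return max(values) - min(values)


# Example call with hypothetical mount points (replace with your own):
#   usage = disk_usage_percent(["/grid/0/hadoop/hdfs/data", "/grid/1/hadoop/hdfs/data"])
#   print(usage, usage_spread(usage))
```

A freshly replaced disk will show up as nearly empty next to its older neighbours, so a large spread right after the swap is expected.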

Cloudera Employee

Hi @JGUI, there is no requirement to delete the data from a datanode that is going to be decommissioned. Once the DN has been decommissioned, all of its blocks will have been replicated to other DNs.


Are you encountering any errors while decommissioning?

Typically, HDFS self-heals: it re-replicates the blocks that become under-replicated because of the decommissioned DN. The NN starts replicating those blocks from the other two replicas that are still present in HDFS.
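As a rough illustration of that behaviour, here is a toy model in Python. It is not how the NameNode is actually implemented: the function name, the block and node names, and the "pick the first candidate" policy are all assumptions made for the sketch, and a replication factor of 3 is assumed.

```python
def rereplicate_after_decommission(block_map, decommissioned, live_nodes):
    """Toy model of re-replication after one datanode is decommissioned.

    block_map: dict mapping block id -> set of node names holding a replica
               (modified in place).
    Returns a list of (block_id, source_node, target_node) copies.
    """
    copies = []
    for block_id in sorted(block_map):
        holders = block_map[block_id]
        if decommissioned not in holders:
            continue  # this block is unaffected
        # Replicas that survive on other nodes can serve as the copy source.
        survivors = sorted(holders - {decommissioned})
        # Any live node that does not already hold the block can receive it.
        targets = sorted(set(live_nodes) - holders)
        holders.discard(decommissioned)
        if survivors and targets:
            source, target = survivors[0], targets[0]
            holders.add(target)
            copies.append((block_id, source, target))
    return copies


blocks = {
    "blk_1": {"dn1", "dn2", "dn3"},
    "blk_2": {"dn2", "dn3", "dn4"},
}
copies = rereplicate_after_decommission(blocks, "dn1", ["dn2", "dn3", "dn4"])
print(copies)
```

In this model only `blk_1` had a replica on `dn1`, so it is the only block copied, and every block ends up with three replicas on the remaining nodes.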
