
Datanode recommissioning


Do I need to delete all the data from a datanode before recommissioning it, or does it not matter because the namenode will not pick up stale data from the datanode?



@guido I think so. Check the docs to see if that helps:



Thanks for your answer.

I cannot see anything in the documentation that clarifies whether I need to delete all the data.

And I'm not using Cloudera Manager but Ambari (it's Hortonworks HDP 2.6.5, Hadoop 2.7.3).

From what I understand, deleting all the data would be better for balancing data across the node's disks (I had to replace a faulty disk). Since I'm stuck on Hadoop 2.7.3, there is no built-in intra-node disk balancing facility.
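Since the intra-node HDFS Disk Balancer only arrived in Hadoop 3.0, a common workaround on 2.7.x is to stop the DataNode and move block subdirectories between the dfs.datanode.data.dir volumes by hand. A rough sketch, with purely illustrative paths (the block-pool directory name and mount points below are placeholders, not taken from this thread):

```shell
# 1. Stop the DataNode first (via Ambari, or the hadoop-daemon.sh script),
#    so it is not writing while directories are moved.

# 2. Move whole finalized subdirectories from the full volume to the new one,
#    keeping the same relative layout and hdfs:hadoop ownership.
#    /data1 and /data2 stand in for your actual dfs.datanode.data.dir entries;
#    BP-XXXX is your cluster's block-pool directory name.
mv /data1/hadoop/hdfs/data/current/BP-XXXX/current/finalized/subdir0 \
   /data2/hadoop/hdfs/data/current/BP-XXXX/current/finalized/

# 3. Restart the DataNode; it rescans its volumes on startup and
#    reports the moved blocks to the NameNode as usual.
```

This only works while the DataNode is fully stopped, and the relative path under each volume must be preserved exactly.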


Hi @JGUI, there is no requirement to delete the data from the datanode that is going to be decommissioned. Once the DN has been decommissioned, all the blocks that were on it will have been re-replicated to other DNs.
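For reference, a typical decommission/recommission cycle driven from the command line looks roughly like this. The exclude-file path and hostname below are examples; with Ambari the same thing is done through the UI:

```shell
# Add the DataNode's hostname to the exclude file referenced by
# dfs.hosts.exclude in hdfs-site.xml (path is an example):
echo "dn1.example.com" >> /etc/hadoop/conf/dfs.exclude

# Tell the NameNode to re-read the include/exclude lists; the DN then
# goes into "Decommission In Progress" while its blocks are copied off:
hdfs dfsadmin -refreshNodes

# Watch progress; the node shows "Decommissioned" when done:
hdfs dfsadmin -report

# To recommission, remove the hostname from the exclude file and refresh again:
sed -i '/dn1.example.com/d' /etc/hadoop/conf/dfs.exclude
hdfs dfsadmin -refreshNodes
```

After recommissioning, any blocks still on the node's disks are reported to the NameNode in the next block report; over-replicated blocks are simply scheduled for deletion.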


Is there any error you are encountering while decommissioning?

Typically, HDFS self-heals and re-replicates the blocks that become under-replicated because of the DN being decommissioned. The NN starts replicating those blocks from the other replicas that are still present in HDFS (with the default replication factor of 3, there are two others).
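One way to watch that re-replication from a cluster node, assuming the hdfs client is on the PATH:

```shell
# Cluster-wide summary, including "Under replicated blocks" and the
# per-DataNode state (Normal / Decommission In Progress / Decommissioned):
hdfs dfsadmin -report

# fsck prints an under-replicated block count for the whole namespace;
# watch it fall back to 0 as the NameNode finishes re-replication:
hdfs fsck / | grep -i "under-replicated"
```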