
Concern about Replication during Scheduled NameNode Maintenance

CentOS 6.6
CDH 5.1.2

I would like to take down a DataNode temporarily (say, for 24 hours). Some questions:


  1. For normally replicated blocks (target replication factor=3), can I stop HDFS from automatically re-replicating those blocks?
  2. For blocks with only a single replica (replication factor=1), can I do anything to move those blocks elsewhere beforehand, in case they are stored on the DataNode being taken down?

I understand that I risk data loss, but the data is not critical anyway.




Re: Concern about Replication during Scheduled NameNode Maintenance

For (1), the answer right now is no. Once dead-node detection occurs, the NameNode will promptly start re-replicating the replicas it has identified as lost. Something along the lines of what you need is being worked on upstream, but the work is still in progress and will only arrive in a future, as yet undetermined, CDH release.
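
For context on when dead-node detection kicks in: as far as I know, stock Apache Hadoop 2.x derives the timeout as 2 x dfs.namenode.heartbeat.recheck-interval + 10 x dfs.heartbeat.interval. The sketch below only does that arithmetic. The property names and the defaults used (300000 ms and 3 s) are assumptions taken from upstream Hadoop, so check them against the hdfs-default.xml of your CDH release. The function name is made up for illustration.

```shell
# Rough estimate of how long the NameNode waits before declaring a DataNode dead.
# Assumed formula: 2 * recheck interval + 10 * heartbeat interval.
# $1 = dfs.namenode.heartbeat.recheck-interval, in milliseconds
# $2 = dfs.heartbeat.interval, in seconds
dead_node_timeout_secs() {
  recheck_ms=$1
  heartbeat_s=$2
  echo $(( 2 * recheck_ms / 1000 + 10 * heartbeat_s ))
}

# With the assumed upstream defaults this prints 630 (10.5 minutes).
dead_node_timeout_secs 300000 3
```

So with default settings, a 24-hour outage is far longer than the detection window, and re-replication of the node's blocks should be expected.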

For (2), you can hunt down the files that have a replication factor of 1, raise them to 2, and wait for the under-replicated block count to reach 0 before you take the DN down. The replication factor can be changed with the command 'hadoop fs -setrep'.
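
A minimal sketch of that hunt, assuming the usual 'hadoop fs -ls -R' listing layout (permissions, replication, owner, group, size, date, time, path). The helper name rep1_files and the /data path are hypothetical. Paths containing runs of multiple spaces would not survive the awk field splitting, so treat this as a starting point rather than a finished tool.

```shell
# Read `hadoop fs -ls -R` output on stdin and print the paths of
# regular files whose replication column is exactly 1.
# Directories show "-" in that column and start with "d", so they are skipped.
rep1_files() {
  awk '$1 !~ /^d/ && $2 == "1" {
    path = $8
    for (i = 9; i <= NF; i++) path = path " " $i   # rejoin paths containing spaces
    print path
  }'
}

# Usage on a live cluster (not run here; /data is a placeholder):
#   hadoop fs -ls -R /data | rep1_files | while IFS= read -r f; do
#     hadoop fs -setrep -w 2 "$f"    # -w waits until the new replica exists
#   done
#   hdfs fsck / | grep -i 'under-replicated'
```

The -w flag can take a long time on large files. Dropping it and polling the fsck summary until the under-replicated count reaches 0 is an alternative.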