Accepted Solution

Concern about Replication during Scheduled DataNode Maintenance

CentOS 6.6
CDH 5.1.2


I would like to take down a DataNode temporarily (say, for 24 hours). Some questions:

  1. For normally replicated blocks (target replication factor=3), can I stop HDFS from automatically re-replicating those blocks?
  2. For blocks with only a single replica (replication factor=1), can I do anything to relocate those blocks in advance in case they are on the DataNode to be taken down?

I understand that I risk data loss, but this is not critical data anyway.

Thanks.


Re: Concern about Replication during Scheduled DataNode Maintenance

For (1), the answer right now is no. Once the NameNode detects the DataNode as dead, it will promptly begin re-replicating the replicas that were lost with it. Something along the lines of what you need is being worked on upstream via https://issues.apache.org/jira/browse/HDFS-7877, but the work is still in progress and will only arrive in a future, as yet undetermined, CDH release.
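
To give a feel for how soon dead node detection would occur, here is a rough sketch of the timeout arithmetic. It is not taken from this thread: it assumes the stock HDFS defaults of dfs.namenode.heartbeat.recheck-interval = 300000 ms and dfs.heartbeat.interval = 3 s, so check your own hdfs-site.xml before relying on the number. The function name is made up for illustration.

```python
# Sketch only: estimates when the NameNode would declare a silent DataNode dead.
# Assumes the stock defaults; your cluster's hdfs-site.xml may override them.

def dead_node_timeout_ms(recheck_interval_ms=300_000, heartbeat_interval_s=3):
    """Return the heartbeat expiry interval in milliseconds.

    Formula: 2 * recheck interval + 10 * heartbeat interval.
    """
    return 2 * recheck_interval_ms + 10 * 1000 * heartbeat_interval_s


if __name__ == "__main__":
    timeout_ms = dead_node_timeout_ms()
    print("DataNode considered dead after about %.1f minutes" % (timeout_ms / 60000.0))
```

With those assumed defaults the node is treated as dead after roughly ten and a half minutes, so a 24-hour outage is far beyond the point at which re-replication starts.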

For (2), you can find the files that have a replication factor of 1, raise them to 2, and wait for the under-replicated block count to reach 0 before you take the DN down. You can change the replication factor with the 'hadoop fs -setrep' command.
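
One possible way to do the hunting, as a minimal sketch: parse the output of 'hadoop fs -ls -R', whose second column is the replication factor for files ('-' for directories), and print a 'hadoop fs -setrep' command for each file found at factor 1. The helper names and the sample listing below are invented for illustration, and the script only prints commands rather than running them.

```python
import shlex

# Hypothetical sample of `hadoop fs -ls -R` output; the paths and owner are made up.
SAMPLE_LISTING = [
    "drwxr-xr-x   - alice supergroup          0 2015-03-01 10:00 /data/tmp",
    "-rw-r--r--   1 alice supergroup       1024 2015-03-01 10:00 /data/tmp/part-00000",
    "-rw-r--r--   3 alice supergroup       2048 2015-03-01 10:00 /data/keep/part-00000",
]


def files_with_replication(ls_lines, factor=1):
    """Yield file paths whose replication factor equals `factor`."""
    for line in ls_lines:
        # Columns: perms, replication, owner, group, size, date, time, path
        fields = line.strip().split(None, 7)
        if len(fields) < 8:
            continue  # skips headers such as "Found N items"
        perms, repl, path = fields[0], fields[1], fields[7]
        if perms.startswith("d") or not repl.isdigit():
            continue  # directories show "-" instead of a replication factor
        if int(repl) == factor:
            yield path


def setrep_commands(paths, new_factor=2):
    """Build one `hadoop fs -setrep` command per path; -w waits for replication."""
    return ["hadoop fs -setrep -w %d %s" % (new_factor, shlex.quote(p)) for p in paths]


if __name__ == "__main__":
    for cmd in setrep_commands(files_with_replication(SAMPLE_LISTING)):
        print(cmd)
```

In practice you would feed it the real listing (for example by reading the saved output of 'hadoop fs -ls -R /' line by line) and review the printed commands before running them. The under-replicated block count can then be checked with 'hdfs fsck /' or 'hdfs dfsadmin -report'.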