Concern about Replication when Scheduled DataNode Maintenance



Rising Star

CentOS 6.6
CDH 5.1.2


I would like to take down a DataNode temporarily (say, for 24 hours). Some questions:

 

  1. For normally replicated blocks (target replication factor = 3), can I prevent HDFS from automatically re-replicating those blocks?
  2. For un-replicated blocks (replication factor = 1), can I do anything beforehand to relocate those blocks in case they reside on the DataNode being taken down?

I understand I risk data loss, but that data is not critical anyway.

 

Thanks.

1 ACCEPTED SOLUTION


Re: Concern about Replication when Scheduled DataNode Maintenance

Master Guru
For (1), the answer right now is no. Once dead-node detection occurs, the NameNode will promptly begin re-replicating the replicas it has identified as lost. Something along the lines of what you need is being worked on upstream via https://issues.apache.org/jira/browse/HDFS-7877, but that work is still in progress and will only arrive in a future, as-yet-undetermined CDH release.
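
For reference, dead-node detection is governed by two standard HDFS settings; with the defaults it fires roughly 10.5 minutes after the DataNode stops heartbeating, so that is about how long you have before re-replication begins. A quick way to check your cluster's values (property names are the standard HDFS 2.x keys):

# The NameNode declares a DataNode dead, and starts re-replicating its blocks, after:
#   2 * dfs.namenode.heartbeat.recheck-interval + 10 * dfs.heartbeat.interval
# Defaults: 2 * 300000 ms + 10 * 3 s = 630 s (~10.5 minutes).
hdfs getconf -confKey dfs.namenode.heartbeat.recheck-interval   # milliseconds
hdfs getconf -confKey dfs.heartbeat.interval                    # seconds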

For (2), you can hunt down files with a replication factor of 1, raise them to 2, and wait for the under-replicated block count to reach 0 before you take the DataNode down. The replication factor can be changed with the 'hadoop fs -setrep' command, as sketched below.
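
A minimal sketch of that sequence (assuming HDFS paths contain no spaces; /tmp/repl1.txt is just an example scratch file):

# List files recursively; the second column of 'hadoop fs -ls' output is the
# replication factor, so keep non-directory entries where it equals 1.
hadoop fs -ls -R / | awk '$1 !~ /^d/ && $2 == 1 {print $NF}' > /tmp/repl1.txt

# Raise each such file to replication factor 2.
while read -r path; do
  hadoop fs -setrep 2 "$path"
done < /tmp/repl1.txt

# Wait until this reports 0 under-replicated blocks, then stop the DataNode.
hdfs dfsadmin -report | grep 'Under replicated'

('hadoop fs -setrep -w 2' would instead block until each file reaches its new factor, but for many files it is usually faster to check the cluster-wide under-replicated count once at the end.)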