Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Concern about Replication during Scheduled NameNode Maintenance

Expert Contributor

CentOS 6.6
CDH 5.1.2


I would like to take down a DataNode temporarily (say, for 24 hours). Some questions:

  1. For normally replicated blocks (target replication factor=3), can I stop HDFS from automatically re-replicating those blocks?
  2. For blocks with a single replica (replication factor=1), can I do anything to relocate them in advance in case they reside on the DataNode being taken down?

I understand that I risk data loss, but the data is not critical anyway.

 

Thanks.

1 ACCEPTED SOLUTION

Mentor
For (1), the answer right now is no. Once dead-node detection occurs, the NameNode acts swiftly to re-replicate the replicas it has identified as lost. Something along the lines of what you need is being worked on upstream in https://issues.apache.org/jira/browse/HDFS-7877, but that work is still in progress and will only arrive in a future, as yet undetermined, CDH release.

For (2), you can hunt down files with a replication factor of 1, raise them to 2, and wait for the under-replicated block count to reach 0 before you take the DN down. The replication factor can be changed with the command 'hadoop fs -setrep'.
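
As a rough sketch of the approach for (2), something like the following could work. It assumes the usual 8-column layout of 'hadoop fs -ls -R' output (permissions, replication, owner, group, size, date, time, path). The function names and the root path passed in are hypothetical, so treat this as a starting point and not a tested procedure for your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the root path are hypothetical.

# Read `hadoop fs -ls -R` output on stdin and print the paths of files
# whose replication column is 1. Assumes the usual 8-column layout:
# perms, replication, owner, group, size, date, time, path.
list_rep1_files() {
  awk '$1 ~ /^-/ && $2 == "1" {
    path = $0
    # Strip the first seven columns so paths containing spaces survive.
    for (i = 1; i <= 7; i++) sub(/^[ \t]*[^ \t]+[ \t]+/, "", path)
    print path
  }'
}

# Raise every replication-1 file under the given root to 2.
# -w makes setrep wait until the target replication is reached.
raise_rep1_to_2() {
  local root="$1"
  hadoop fs -ls -R "$root" | list_rep1_files | while IFS= read -r f; do
    # Redirect stdin so the hadoop client cannot consume the file list.
    hadoop fs -setrep -w 2 "$f" < /dev/null
  done
}
```

You would call it as 'raise_rep1_to_2 /some/root' and then check the "Under-replicated blocks" line in the summary printed by 'hdfs fsck /' before stopping the DN. Once the node is back, the same loop with a target of 1 can restore the original factor if the extra space matters.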

