Support Questions

What happens when a failed DataNode comes back up (in terms of replication)?

For example, if a DataNode goes down and the NameNode receives no heartbeat from it for 10 minutes, the NameNode will re-replicate the blocks that DataNode held onto other DataNodes to maintain the replication factor. But suppose the DataNode comes back up after those 10 minutes; its blocks will then be over-replicated. How is the replication factor maintained in that case? Will HDFS detect and delete the excess replicas automatically?

1 ACCEPTED SOLUTION


Super Mentor

@manpreet kaur

Yes. HDFS detects over-replicated blocks automatically: the NameNode selects excess replicas on different nodes and instructs those DataNodes to delete them, keeping the remaining replicas spread out.
If for some reason HDFS rebalancing does not happen automatically, you can trigger it from the command line or from the Ambari UI:

Ambari UI -> HDFS -> Service Actions -> Rebalance HDFS

It is recommended to run the HDFS Balancer periodically, at times when the cluster load is expected to be lower than usual.

https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-operations/content/rebalancing_hd...
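If you prefer the command line over the Ambari UI, the Balancer can be started as the hdfs user. A minimal sketch (the -threshold value of 10 percent is the Balancer's default, shown here only for illustration; this must be run on a node with the HDFS client configured):

```shell
# Run the HDFS Balancer as the hdfs service user.
# -threshold is the allowed deviation of each DataNode's disk
# utilization from the cluster average, in percent (default: 10).
su - hdfs -c "hdfs balancer -threshold 10"
```

The Balancer evens out disk usage across DataNodes; note that excess replicas from a returning DataNode are normally deleted by the NameNode on its own, without running the Balancer.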


The following commands show over-replicated blocks:

# su - hdfs
# hdfs fsck /
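The full fsck report can be long on a large cluster; a quick way to see just the replication counters is to filter the summary lines. A sketch, assuming the typical fsck summary wording ("Over-replicated blocks:" / "Under-replicated blocks:"):

```shell
# Run fsck on the whole namespace and keep only the replication
# summary counters from its report.
su - hdfs -c "hdfs fsck / | grep -i 'replicated blocks'"
```

If the over-replicated count stays above zero for long after the DataNode rejoins, check the NameNode logs before intervening manually.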





Thanks Jay 🙂