
What happens when a failed DataNode comes back up (in terms of replication)?


For example, if one of the DataNodes goes down and the NameNode receives no heartbeat from it for 10 minutes, the NameNode will copy all the blocks that the DataNode held to other DataNodes in order to maintain the replication factor. But suppose the DataNode comes back up after those 10 minutes; its blocks will then be over-replicated. How will HDFS maintain the replication factor in that case? Will it detect and delete the extra replicas automatically?
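(For reference, the roughly 10-minute window mentioned above is governed by dfs.namenode.heartbeat.recheck-interval and dfs.heartbeat.interval; with the default values of 5 minutes and 3 seconds, the NameNode marks a DataNode dead after 2 x 5 min + 10 x 3 s = 10.5 minutes. To check the values on a given cluster, assuming the default property names:

# hdfs getconf -confKey dfs.namenode.heartbeat.recheck-interval
# hdfs getconf -confKey dfs.heartbeat.interval

)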

1 ACCEPTED SOLUTION

Master Mentor

@manpreet kaur

Over-replicated blocks are removed from randomly chosen nodes by HDFS itself, and the cluster is rebalanced.
But if for some reason HDFS rebalancing does not happen automatically, you can trigger it via the command line or the Ambari UI:

Ambari UI -> HDFS -> Service Actions -> Rebalance HDFS

It is recommended to run the HDFS Balancer periodically during times when the cluster load is expected to be lower than usual.
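If you prefer the command-line route mentioned above, a minimal sketch (the threshold of 10, i.e. 10 percent deviation from average utilization, is just an example value):

# su - hdfs
# hdfs balancer -threshold 10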

https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-operations/content/rebalancing_hd...


The following commands will show over-replicated blocks:

# su - hdfs
# hdfs fsck /
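The fsck summary includes an "Over-replicated blocks" line; to pull out just that figure, one option is a simple grep (assuming the default fsck output format):

# hdfs fsck / | grep -i 'over-replicated'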




Thanks Jay 🙂