Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Will data be replicated to a newly added DataNode once the balancer has run on the existing cluster?

New Member

@Jay Kumar SenSharma kindly help.

I removed one DataNode from a five-node cluster.

Now I am adding a new DataNode to the cluster. If I run the balancer, will data be replicated to the newly added DataNode? (The existing replication factor is 3.)

1 ACCEPTED SOLUTION

Master Mentor

@kotesh banoth

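Newly written data can be placed on the newly added DataNode, but HDFS does not move existing blocks to it automatically. If you want to spread the existing blocks onto the new node, it is best to run the HDFS balancer, either from the Ambari UI or from the command line. The balancer only moves existing block replicas between DataNodes; it does not change the replication factor, which stays at 3.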

HDFS provides a “balancer” utility to help balance the blocks across DataNodes in the cluster.

Ambari UI --> HDFS --> Service Actions --> HDFS Rebalance

(OR)

# su - hdfs -c "hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10"
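The "-threshold 10" in the command above is a percentage: the balancer treats the cluster as balanced when every DataNode's utilization (used space / capacity) is within that many percentage points of the overall cluster utilization (total used / total capacity). As a rough illustration of that rule only, here is a small shell sketch. The function name check_balance and the "name used capacity" input format are assumptions made up for this example; they are not part of HDFS or Ambari, and the balancer does this calculation internally.

```shell
#!/usr/bin/env bash
# Sketch only: shows which nodes fall outside a balancer-style threshold.
# check_balance is a hypothetical helper, not an HDFS command.
# Input (stdin): one line per DataNode, "name used capacity",
# with used and capacity in the same unit (for example GB).
check_balance() {
  local threshold="${1:-10}"   # percentage points, like -threshold
  awk -v t="$threshold" '
    # Keep only well-formed lines with a non-zero capacity.
    NF >= 3 && $3 > 0 {
      n++
      name[n] = $1; used[n] = $2; cap[n] = $3
      total_used += $2; total_cap += $3
    }
    END {
      if (total_cap == 0) { print "no capacity data"; exit 1 }
      # Cluster utilization is total used over total capacity.
      avg = 100 * total_used / total_cap
      printf "cluster utilization: %.1f%%\n", avg
      for (i = 1; i <= n; i++) {
        u = 100 * used[i] / cap[i]
        d = u - avg
        if (d > t + 0)         state = "over-utilized"
        else if (d < -(t + 0)) state = "under-utilized"
        else                   state = "within threshold"
        printf "%s %.1f%% %s\n", name[i], u, state
      }
    }'
}
```

For example, with four old nodes at 60% and an empty new node (hypothetical numbers), you could run:

printf 'dn1 60 100\ndn2 60 100\ndn3 60 100\ndn4 60 100\ndn5 0 100\n' | check_balance 10

The real per-node figures can be read from "hdfs dfsadmin -report" before and after the balancer run.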

https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-operations/content/rebalancing_hd...
