Is there a command to track progress on HDFS replication?
After changing the replication factor from 1 to 3, I need to track how long the re-replication is going to take. I need either to track progress while it runs or, ideally, to retrieve after the fact how long it took. This is to help optimize certain configurations, so the test will have to be repeated several times and the performance metrics compared.
The HDFS Balancer is a tool for balancing data across the storage devices of an HDFS cluster. It moves blocks until the cluster is deemed balanced, which means that the utilization of every DataNode (the ratio of used space on the node to the node's total capacity) differs from the utilization of the cluster (the ratio of used space on the cluster to the cluster's total capacity) by no more than a given threshold percentage.
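To make the balance criterion above concrete, here is a minimal sketch in Python. It is only an illustration of the definition, not the Balancer's actual code; the function name `is_balanced` and the `(used, capacity)` input format are assumptions made for this example.

```python
def is_balanced(nodes, threshold_pct):
    """Return True if every DataNode is within threshold_pct of cluster utilization.

    nodes: list of (used_bytes, capacity_bytes) tuples, one per DataNode
           (hypothetical input format chosen for this sketch).
    threshold_pct: allowed deviation, in percentage points.
    """
    total_used = sum(used for used, _ in nodes)
    total_capacity = sum(capacity for _, capacity in nodes)
    # Cluster utilization: used space on the cluster / total capacity of the cluster.
    cluster_util = 100.0 * total_used / total_capacity
    # Each node's utilization must differ from the cluster's by at most the threshold.
    return all(
        abs(100.0 * used / capacity - cluster_util) <= threshold_pct
        for used, capacity in nodes
    )
```

For example, with two nodes at 50% and 70% utilization the cluster sits at 60%, so a threshold of 10 would count as balanced while a threshold of 5 would not.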
The HDFS rebalance operation can be triggered either from the Ambari UI or from the command line. In Ambari:
Ambari UI --> HDFS --> Actions (drop-down) --> Rebalance HDFS
Then specify the "Balancer threshold (percentage of disk capacity)".
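For the command-line route, and since the question is about measuring how long an operation took, one possible approach is to wrap the `hdfs balancer -threshold <percentage>` command in a small timing script. This is a rough sketch, assuming the `hdfs` client is on the PATH of the machine running it; the helper names `build_balancer_command` and `timed_run` are made up for this example.

```python
import subprocess
import time


def build_balancer_command(threshold_pct):
    """Build the argument list for the HDFS Balancer CLI with the given threshold."""
    return ["hdfs", "balancer", "-threshold", str(threshold_pct)]


def timed_run(command, runner=subprocess.run):
    """Run a command and return (result, elapsed_seconds).

    runner is injectable so the timing logic can be exercised without a
    cluster; by default it is subprocess.run, which blocks until the
    command exits.
    """
    start = time.monotonic()
    result = runner(command)
    elapsed = time.monotonic() - start
    return result, elapsed
```

Calling `timed_run(build_balancer_command(10))` on a cluster node would block until the balancer exits and return the wall-clock duration, which can be logged and compared across repeated test runs.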
The above question and the entire reply thread below were originally posted in the Community Help track. On Mon Jul 1 01:31 UTC 2019, a member of the HCC moderation staff moved it to the Hadoop Core track. The Community Help track is intended for questions about using the HCC site itself, not technical questions about HDFS replication.
Bill Brooks, Community Moderator