I am managing a CDH 5.13 cluster with 4 DataNodes. Each DataNode had 10 x 2.7 TB disks (~90% used), and we just added 8 x 3.6 TB disks to each node.
I ran a "Rebalance" on the HDFS service, which apparently did nothing, since all nodes have the same total disk usage.
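As a rough sanity check of the numbers above, here is a minimal sketch of the per-node arithmetic. It assumes the ~90% figure applies evenly across the 10 old disks and that the 8 new disks started empty; the names (`node_summary` and the constants) are purely illustrative.

```python
# Per-node figures taken from the post. The even 90% fill of the old disks
# and the empty new disks are assumptions, not measured values.
OLD_DISKS, OLD_DISK_TB, OLD_USED_FRACTION = 10, 2.7, 0.90
NEW_DISKS, NEW_DISK_TB = 8, 3.6


def node_summary():
    """Return (used_tb, total_tb, overall_used_fraction) for one DataNode."""
    old_capacity = OLD_DISKS * OLD_DISK_TB
    new_capacity = NEW_DISKS * NEW_DISK_TB
    used = old_capacity * OLD_USED_FRACTION
    total = old_capacity + new_capacity
    return used, total, used / total


if __name__ == "__main__":
    used, total, fraction = node_summary()
    print(f"used per node: {used:.1f} TB of {total:.1f} TB")
    print(f"overall usage: {fraction:.1%}")
    print(f"old disks: {OLD_USED_FRACTION:.0%} used, new disks: 0% used")
```

Under these assumptions every node ends up with the same overall usage, which would be consistent with the inter-node Rebalance finding nothing to move, while the old and new disks inside each node remain far apart.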
Next, I followed this post to balance the disks within each node (intra-node balancing, with the threshold set to 25). After 1 hour of execution, progress is terribly slow (as you can see from /data/18, the last disk, which is the one receiving data):
1. Currently, there are no pipelines accessing HDFS, but tomorrow morning there will be, and it's obvious from the progress so far that disk balancing won't have finished by then. Is it safe to let this process run to completion while the cluster is in production?
2. Is there something I can do to speed things up?
3. How can I safely terminate this process if required?