Can I still use HDFS like normal when HDFS Balancer is running?

Contributor

We just added a new node to our cluster, and I'm now running the Balancer to redistribute data onto the new node.

The Balancer has been running for a long time, and we're holding off on all other work until it completes. We're worried that reading from or writing to HDFS while the Balancer is running could corrupt data files.

Should we wait, or can we just work as normal?

1 REPLY

Super Collaborator

Hi, @quangbilly79 

Yes, you can continue to use HDFS normally while the Balancer is running. The Balancer only moves block replicas between DataNodes to even out disk usage; it does not change the contents of your files. Reads and writes are fully supported in parallel with balancing, and HDFS protects data integrity through replication and checksums. Balancing does add some extra network and disk load, so you might see reduced performance while a lot of data is being moved. There is no risk of data corruption caused by the Balancer. You don't need to wait; it's safe to continue your normal operations.
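
If the extra load becomes noticeable, one option is to cap the bandwidth the Balancer may use and then confirm the filesystem still reports healthy. The following is only a rough sketch, not a tuned recommendation: it wraps the standard `hdfs dfsadmin -setBalancerBandwidth` and `hdfs fsck` commands, the helper names (`bandwidth_cmd`, `fsck_cmd`, `run`) are made up for illustration, and the 10 MB/s value is an arbitrary assumption that you would adjust for your own cluster.

```python
import shutil
import subprocess


def bandwidth_cmd(mb_per_sec):
    """Build the dfsadmin command that caps Balancer traffic per DataNode.

    -setBalancerBandwidth takes a value in bytes per second.
    """
    bytes_per_sec = int(mb_per_sec * 1024 * 1024)
    return ["hdfs", "dfsadmin", "-setBalancerBandwidth", str(bytes_per_sec)]


def fsck_cmd(path="/"):
    """Build the fsck command that reports the health of files under `path`."""
    return ["hdfs", "fsck", path]


def run(cmd):
    """Run a command if the hdfs CLI is available; otherwise just report it."""
    if shutil.which(cmd[0]) is None:
        print("skipping (hdfs not on PATH):", " ".join(cmd))
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Assumed value: 10 MB/s per DataNode. Pick a limit that suits your network.
    run(bandwidth_cmd(10))

    # Check overall filesystem health while the Balancer keeps running.
    result = run(fsck_cmd("/"))
    if result is not None:
        # The summary is at the end of the fsck output.
        print(result.stdout[-500:])
```

The bandwidth setting is applied to the running DataNodes, so the Balancer itself does not need to be restarted for the limit to take effect.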