HDFS rebalancing



Is there any way to run the balancer automatically every day?


Re: HDFS rebalancing

Master Guru
At present there is no inbuilt way to do this.

You can, however, schedule the balancer command as a Linux cron
job to run daily at a preferred time.
Or, if you use Cloudera Manager (CM), you can trigger its balancer actions via the REST
API documented at
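
For the cron route, here is a minimal sketch of a wrapper script, assuming the `hdfs` command is on the PATH of the user that runs the job. The script name, its path, the 02:00 schedule, and the 10 percent threshold are illustrative assumptions, not values from this thread; adjust them for your cluster.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper that starts the HDFS balancer once.

Example crontab entry (path and time are assumptions):
    0 2 * * * /usr/bin/python3 /opt/scripts/run_balancer.py
"""
import subprocess


def build_balancer_command(threshold_percent=10.0):
    """Return the argument list for `hdfs balancer -threshold <percent>`."""
    return ["hdfs", "balancer", "-threshold", str(threshold_percent)]


def run_balancer(threshold_percent=10.0):
    """Run the balancer and return its exit code without raising."""
    completed = subprocess.run(build_balancer_command(threshold_percent), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Pass the balancer's exit code back to cron so failures are visible.
    raise SystemExit(run_balancer())
```

Keeping the command construction in its own function makes it easy to change the threshold later without touching the crontab entry.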


Re: HDFS rebalancing

Here is my scenario:

I have three Hadoop boxes and no replication.

When data comes into HDFS, all of it is stored on box 1, even though there is free space on boxes 2 and 3.

Is this issue caused by the lack of replication?

By default the framework stores data in a round-robin manner, right?




Re: HDFS rebalancing

Master Guru
The default policy of an HDFS client is to store its first block
replica on the same node as the client, if the client host is also
serving a DataNode. This is for optimisation of writes.

If you want a more randomized write, write from outside of the cluster
nodes (i.e. from a host that is not a DataNode).
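
To illustrate why everything lands on one box, here is a toy model of the rule described above. It is not HDFS code: the function name and host names are made up, and the random choice for an outside client is a simplification of what the NameNode actually does.

```python
import random
from collections import Counter


def choose_first_replica(client_host, datanodes, rng=random):
    """Toy model: a client that is also a DataNode writes locally,
    any other client gets a randomly chosen DataNode."""
    if client_host in datanodes:
        return client_host
    return rng.choice(datanodes)


if __name__ == "__main__":
    nodes = ["box1", "box2", "box3"]  # hypothetical host names
    # Writing 300 single-replica blocks from box1 puts all of them on box1.
    print(Counter(choose_first_replica("box1", nodes) for _ in range(300)))
    # Writing from a host outside the cluster spreads the blocks around.
    print(Counter(choose_first_replica("edge-host", nodes) for _ in range(300)))
```

With no replication there is only the first replica, so a writer running on box 1 keeps every block on box 1 in this model.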
