Will data automatically move to new DataNodes?

Contributor

Hi,

I added two new DataNodes to my HDFS cluster, in the same rack. Will data automatically move to the new DataNodes?

My old DataNodes have 10 TB of space per node and the new DataNodes have 20 TB per node. When the old DataNodes are full, will the new DataNodes be at only 50% usage? Or will HDFS put more data on the larger DataNodes? Or do I need to run the HDFS balancer manually when the old DataNodes are full?

Thanks.

1 ACCEPTED SOLUTION

Master Mentor

@Sen Ke

HDFS provides a “balancer” utility to help balance the blocks across DataNodes in the cluster.

You can run "Rebalance HDFS" from the Ambari UI, as described in https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-operations/content/rebalancing_hd...

To initiate a balancing process, follow these steps:

  1. In Ambari Web, browse to Services > HDFS > Summary.
  2. Click Service Actions, and then click Rebalance HDFS.
  3. Enter the Balance Threshold value as a percentage of disk capacity.
  4. Click Start.

You can monitor or cancel a rebalance process by opening the Background Operations window in Ambari.
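
To make the Balance Threshold more concrete, here is a minimal Python sketch of the balance criterion as the Hadoop balancer documentation describes it: a DataNode counts as balanced when its utilization (used space divided by capacity) is within the threshold of the overall cluster utilization. This is a simplified illustration, not the balancer's actual code. The node names and usage figures are hypothetical; only the 10 TB and 20 TB capacities come from the question.

```python
# Hypothetical cluster: capacities are from the question, usage figures are invented.
# Each entry maps a node name to (used_tb, capacity_tb).
NODES = {
    "old-dn1": (9.0, 10.0),
    "old-dn2": (9.0, 10.0),
    "new-dn1": (1.0, 20.0),
    "new-dn2": (1.0, 20.0),
}


def cluster_utilization(nodes):
    """Return overall used space as a percentage of total capacity."""
    used = sum(u for u, _ in nodes.values())
    capacity = sum(c for _, c in nodes.values())
    return 100.0 * used / capacity


def classify(nodes, threshold_pct=10.0):
    """Label each node relative to the cluster average and the threshold."""
    avg = cluster_utilization(nodes)
    labels = {}
    for name, (used, capacity) in nodes.items():
        util = 100.0 * used / capacity
        if util > avg + threshold_pct:
            labels[name] = "over-utilized"
        elif util < avg - threshold_pct:
            labels[name] = "under-utilized"
        else:
            labels[name] = "balanced"
    return labels


for node, label in classify(NODES).items():
    print(node, label)
```

With these figures the cluster utilization is about 33%, so the old nodes at 90% are flagged as over-utilized and the new nodes at 5% as under-utilized. The comparison is on percentage utilization, not absolute terabytes, so nodes of different sizes are measured against the same cluster-wide percentage.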

Reference Article:

https://community.hortonworks.com/articles/87440/hdfs-balancer-balancing-data-between-disks-on-a-da....

REPLIES

Contributor

@Jay Kumar SenSharma

Thanks for the reply. So rebalancing can only be done manually, right?

Master Mentor

@Sen Ke

Rebalancing is not something we want to schedule on an hourly or daily basis, so there is currently no out-of-the-box option to schedule it.

Ambari simply makes it easy by providing an option to run it whenever it is needed.

However, if you want to schedule it, you can try using cron jobs to do so.
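
As a rough sketch of the cron approach: a small Python wrapper around the standard `hdfs balancer -threshold <percent>` command, which a crontab entry could then call. The script path, the weekly schedule, and the assumption that it runs as a user allowed to run the balancer (typically the `hdfs` user) are illustrative and not taken from this thread. The script only prints the command unless `--run` is passed.

```python
import subprocess
import sys

# Hypothetical crontab entry (weekly, Sunday 02:00; path is a placeholder):
#   0 2 * * 0 python3 /path/to/run_balancer.py --run


def build_balancer_command(threshold_pct=10):
    """Return the argument list for the HDFS balancer CLI."""
    if not 1 <= threshold_pct <= 100:
        raise ValueError("threshold must be between 1 and 100 percent")
    return ["hdfs", "balancer", "-threshold", str(threshold_pct)]


def main(argv):
    """Print the command by default; execute it only when --run is given."""
    cmd = build_balancer_command(10)
    if "--run" not in argv:
        print("dry run:", " ".join(cmd))
        return 0
    # Returns the balancer's exit status so the caller can detect failures.
    return subprocess.call(cmd)


if __name__ == "__main__":
    status = main(sys.argv[1:])
    if status != 0:
        # Exit non-zero on failure so cron can report it.
        sys.exit(status)
```

The dry-run default is a deliberate choice: it lets you check the exact command before letting cron move blocks around the cluster.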

Contributor

@Jay Kumar SenSharma

I tried to rebalance HDFS, but it failed and showed:

0 moved / 0 left / 0 being processed,

stderr: /var/lib/ambari-agent/data/errors-1216.txt (I cannot find this log)

stdout: /var/lib/ambari-agent/data/output-1216.txt (I cannot find this log)

It looks like it did not do any work. Is any action needed before rebalancing?