
Rebalance the data size on DataNode disks


Hi all,

We have a production cluster running HDP 2.6.4.

We have 186 DataNode machines (Dell machines with 10 disks each).

We are trying to rebalance the data across the disks so that every disk has the same used size, but without success.

We suspect that version 2.6.4 does not include any tool that supports this kind of rebalancing.


As I mentioned, each DataNode machine has 10 disks, and each disk is 1.8T.

Some of the disks are 55% used, and some of them are only 1% used.

So the disks are not balanced (it is as if some disks are not being used at all). Why did HDFS not balance the data across all the disks?


My question: from which HDP version can we rebalance the disks within a DataNode?

Does version 2.6.5 support this rebalancing, or only 3.x?

Please advise: what can we do?

As I mentioned, this is a very large cluster, and we have a bad feeling that the current HDP version (2.6.4) does not support any disk rebalancing. Is that true?


Example:


/dev/sdc                 3842878616 357409860 3485452372  10% /data_hdfs/sdc
/dev/sde                 3842878616 460433776 3382428456  42% /data_hdfs/sde
/dev/sdi                 3842878616   8606628   34255604   1% /data_hdfs/sdi
/dev/sdg                 3842878616 256937520   85924712   7% /data_hdfs/sdg
/dev/sdd                 3842878616 465520852 3377341380  53% /data_hdfs/sdd
/dev/sdh                 3842878616     90136   42772096   1% /data_hdfs/sdh
/dev/sdb                 3842878616 466423860 3376438372  53% /data_hdfs/sdb
Michael-Bronson
1 ACCEPTED SOLUTION


Re: re balance the data size on data node disks

New Contributor

Hello!

It seems the disk balancer utility was introduced in Apache Hadoop 3.0.0-alpha1, so on the HDP side it should only be available from 3.x; see here.

Someone mentioned that it would be technically possible to backport it to earlier HDP versions, but there seems to have been no progress on that:

"As far as I know, we have not backported this change to HDP 2.1 or 2.4.2. There is nothing technically preventing us from doing so; disk balancer does not depend on any of the newer 3.0 features."


In another discussion, they suggest decommissioning the node and then recommissioning it. Yes, it is an arduous task, but it is better than nothing.
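
Before going through the decommission and recommission cycle, it may help to quantify how uneven each node's disks really are, so the work can start with the worst nodes. The following is a minimal Python sketch, assuming `df`-style rows like the ones in the question (device, 1K-blocks, used, available, use%, mount point). The function names and the sample rows are hypothetical and are not part of any HDP tool.

```python
def disk_usage(df_lines):
    """Map each mount point to its used fraction (Used / 1K-blocks)."""
    usage = {}
    for line in df_lines:
        fields = line.split()
        # Skip the df header line and anything that is not a data row.
        if len(fields) < 6 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        total_kb, used_kb = int(fields[1]), int(fields[2])
        if total_kb > 0:
            usage[fields[5]] = used_kb / total_kb
    return usage


def usage_spread(usage):
    """Difference between the fullest and the emptiest disk (0.0 if no disks)."""
    if not usage:
        return 0.0
    return max(usage.values()) - min(usage.values())


if __name__ == "__main__":
    # Hypothetical sample rows, not the real cluster data.
    sample = [
        "Filesystem 1K-blocks Used Available Use% Mounted on",
        "/dev/sdb 1000 550 450 55% /data_hdfs/sdb",
        "/dev/sdc 1000 10 990 1% /data_hdfs/sdc",
    ]
    usage = disk_usage(sample)
    for mount, fraction in sorted(usage.items()):
        print(f"{mount}: {fraction:.0%} used")
    print(f"spread: {usage_spread(usage):.0%}")
```

Running something like this against the `df` output of every DataNode and sorting by spread would give a rough priority list; the threshold at which a node is worth recommissioning is a judgment call.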


Apache documentation for Disk rebalancing
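
For clusters that do run Hadoop 3.x, the Apache documentation describes a plan, execute, query workflow for the `hdfs diskbalancer` command. The sketch below only builds those three command lines in Python and prints them, without running anything. The host name and the plan file path are placeholders (the real plan path is printed by the `-plan` step), and the feature has to be enabled through `dfs.disk.balancer.enabled` in hdfs-site.xml. As far as this thread indicates, none of this is available on HDP 2.6.x.

```python
def diskbalancer_commands(datanode, plan_file):
    """Build the plan / execute / query command lines for one DataNode.

    datanode:  host name of the DataNode to balance (placeholder here).
    plan_file: path of the plan JSON that the -plan step reports (placeholder here).
    """
    return [
        # 1. Compute a plan describing how much data to move between disks.
        ["hdfs", "diskbalancer", "-plan", datanode],
        # 2. Submit the generated plan to the DataNode for execution.
        ["hdfs", "diskbalancer", "-execute", plan_file],
        # 3. Ask the DataNode for the status of the running plan.
        ["hdfs", "diskbalancer", "-query", datanode],
    ]


if __name__ == "__main__":
    # Hypothetical host and plan path, for illustration only.
    for cmd in diskbalancer_commands("datanode01.example.com",
                                     "/path/to/datanode01.plan.json"):
        print(" ".join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later, once the plan path reported by the first step is known.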
