Explorer
Posts: 9
Registered: ‎10-18-2016
Accepted Solution

Balancing Blocks Between Disks on Datanode

Hi. Some of my datanodes have disks of different sizes. For example:

 

Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdc1   918G  384G  534G   42%   /data/disk1
/dev/sdd1   459G  381G   78G   84%   /data/disk2
/dev/sde1   459G  391G   69G   86%   /data/disk3
/dev/sdf1   459G  389G   70G   85%   /data/disk4

 

 

My understanding is that there is currently no functionality for balancing within a datanode, so I'd have to move data around manually. I've found this article on performing the procedure: http://www-01.ibm.com/support/docview.wss?uid=swg21702775 (Procedure 1). Has anyone actually done this (or something similar)? Can you share any issues/caveats you ran across? Is this the best way to do it? If the other 3 disks fill up, will that datanode continue to write to disk1? 

 

Thank you. 

Cloudera Employee
Posts: 22
Registered: ‎08-16-2016

Re: Balancing Blocks Between Disks on Datanode

I think the manual procedure you described carries its own inherent risk.

 

As of CDH 5.8.2, you can use a new HDFS feature, the intra-datanode disk balancer, to do exactly what you asked for. We have a new blog post about this feature:

http://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache...
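To get a feel for how much data would have to move on the datanode from the original post, here is a rough Python sketch. It is only back-of-the-envelope arithmetic on the df figures quoted above, assuming the goal is the same used-to-capacity ratio on every disk; it is not the disk balancer's actual planning code, and the function name rebalance_targets is made up for this illustration.

```python
# Capacity and used space in GB, copied from the df output in the original post.
volumes = {
    "/data/disk1": (918, 384),
    "/data/disk2": (459, 381),
    "/data/disk3": (459, 391),
    "/data/disk4": (459, 389),
}


def rebalance_targets(vols):
    """Return, per mount point, the GB to add (+) or remove (-) so that
    every disk ends up with the same used/capacity ratio.

    Assumption: "balanced" means equal utilization across disks.
    """
    total_capacity = sum(cap for cap, _ in vols.values())
    total_used = sum(used for _, used in vols.values())
    ideal_ratio = total_used / total_capacity
    return {
        mount: cap * ideal_ratio - used
        for mount, (cap, used) in vols.items()
    }


if __name__ == "__main__":
    for mount, delta in rebalance_targets(volumes).items():
        action = "receive" if delta > 0 else "shed"
        print(f"{mount}: {action} about {abs(delta):.0f} GB")
```

With these numbers every disk would settle at roughly 67% used, which means disk1 would take in about 234 GB that the three smaller disks give up between them.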

 

Explorer
Posts: 9
Registered: ‎10-18-2016

Re: Balancing Blocks Between Disks on Datanode

Thank you. This is a great feature, and I appreciate the link. Unfortunately, our cluster is running 5.7.1, and given my lack of experience with CDH (I inherited this cluster), I'm loath to upgrade it at the moment.
Expert Contributor
Posts: 90
Registered: ‎02-15-2016

Re: Balancing Blocks Between Disks on Datanode

Same issue here. Is there anything like the disk balancer for versions older than 5.8?

Cloudera Employee
Posts: 22
Registered: ‎08-16-2016

Re: Balancing Blocks Between Disks on Datanode

The disk balancer is a new feature in CDH 5.8 and, as a new feature, it will not be backported to an older minor version.