
HDFS Balancing on nodes with different capacities

Some of our data nodes have only half the disk capacity of the others, but HDFS balancing seems to put the same absolute amount of data on every node. This fills the lower-capacity nodes almost completely, while the higher-capacity nodes still have plenty of free space.

Is there a way to configure HDFS so that data is distributed across the nodes by relative capacity usage instead?

Thanks in advance. Cheers,

Martin Purkert

martin.purkert@bawagpsk.com

Re: HDFS Balancing on nodes with different capacities

Hi @Martin Purkert,

You can use the hdfs balancer with the following options:

-exclude -f <hosts-file> | <comma-separated list of hosts>
    Excludes the specified datanodes from being balanced by the balancer.
-include -f <hosts-file> | <comma-separated list of hosts>
    Includes only the specified datanodes to be balanced by the balancer.
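
To decide which hosts to include or exclude, it can help to look at relative usage per node rather than absolute amounts. As far as I know, the balancer compares each datanode's utilization percentage with the cluster-wide average and treats a node as balanced when the difference stays within the -threshold value (10 percent by default). The sketch below only reproduces that arithmetic. The node names, capacities, and the classify helper are made up for illustration and are not part of HDFS or taken from your cluster.

```python
# Hypothetical capacities and usage in TB (illustrative values only).
nodes = {
    "dn1": {"capacity": 20.0, "used": 9.0},
    "dn2": {"capacity": 20.0, "used": 9.0},
    "dn3": {"capacity": 10.0, "used": 9.0},  # half-size node, nearly full
}


def classify(nodes, threshold_pct=10.0):
    """Label each node relative to the cluster-wide average utilization.

    threshold_pct is assumed to play the role of the balancer's -threshold.
    Returns (average utilization in percent, {node: (utilization, label)}).
    """
    total_used = sum(n["used"] for n in nodes.values())
    total_capacity = sum(n["capacity"] for n in nodes.values())
    average = 100.0 * total_used / total_capacity

    result = {}
    for name, n in nodes.items():
        utilization = 100.0 * n["used"] / n["capacity"]
        if utilization > average + threshold_pct:
            label = "over-utilized"
        elif utilization < average - threshold_pct:
            label = "under-utilized"
        else:
            label = "within threshold"
        result[name] = (round(utilization, 1), label)
    return round(average, 1), result


if __name__ == "__main__":
    average, report = classify(nodes)
    print(f"cluster average: {average}%")
    for name, (utilization, label) in sorted(report.items()):
        print(f"{name}: {utilization}% ({label})")
```

With these made-up numbers every node holds the same absolute amount, but only the half-size node ends up far above the cluster average, so it is the one a relative comparison would flag.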

Hope this helps you.
