Rebalancing differently sized nodes
- Labels: HDFS
Created ‎02-02-2018 02:16 AM
I would like to clarify my understanding of how rebalancing works.
I have a cluster composed of two node types. The first half of the data nodes has twice the disk capacity of the second half. At the moment, data is distributed fairly uniformly across the cluster in terms of data volume. As a result, the smaller nodes are running at over 85% disk usage while the larger nodes are at about 50%.
Do I understand correctly that when I turn on rebalancing and set the HDFS Rebalancing Threshold to 10.0 (10%), the cluster will rebalance with respect to the relative disk usage of each data node, and that the result will be something like 65% disk usage on all nodes regardless of each node's physical disk capacity?
Created ‎02-02-2018 02:25 AM
percentage per node rather than by average byte count.
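To make the percentage-based view concrete, here is a rough sketch of the arithmetic for the cluster described in the question. It is a simplified model, not the balancer's actual code: it assumes a node counts as balanced when its utilization is within the threshold of the cluster-wide utilization (total used divided by total capacity). The node sizes (2 TB and 1 TB, expressed in GB) and the function names are made up for illustration.

```python
def cluster_utilization(nodes):
    """Return cluster-wide utilization in percent.

    nodes: list of (capacity, used) pairs, both in the same unit.
    """
    total_capacity = sum(capacity for capacity, _ in nodes)
    total_used = sum(used for _, used in nodes)
    return 100.0 * total_used / total_capacity


def classify(nodes, threshold):
    """Compare each node's utilization with the cluster average.

    Returns (average, [(node_utilization, status), ...]).
    """
    avg = cluster_utilization(nodes)
    report = []
    for capacity, used in nodes:
        util = 100.0 * used / capacity
        if util > avg + threshold:
            status = "over-utilized"
        elif util < avg - threshold:
            status = "under-utilized"
        else:
            status = "within threshold"
        report.append((util, status))
    return avg, report


# Hypothetical cluster: two large nodes (2000 GB, 50% full)
# and two small nodes (1000 GB, 85% full).
nodes = [(2000, 1000), (2000, 1000), (1000, 850), (1000, 850)]
avg, report = classify(nodes, threshold=10.0)

print(f"cluster utilization: {avg:.1f}%")
for util, status in report:
    print(f"node at {util:.1f}%: {status}")
```

Under these assumptions the cluster-wide utilization works out to about 61.7% rather than 65%, so with a threshold of 10 the nodes would be considered balanced anywhere between roughly 51.7% and 71.7%. The large nodes at 50% fall just below that band and the small nodes at 85% fall well above it, so data would move from the small nodes to the large ones.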
