Let's say you have two racks, one with 18 nodes and one with 6 nodes.
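If most of your work is done from a specific node on rack one, then two-thirds of the block replicas in your cluster would
go to the second rack (with the default replication factor of 3, the first replica stays local and the other two go to a different rack).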
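To put rough numbers on that imbalance, here is a small sketch of the arithmetic. It assumes the two-thirds figure above and looks only at per-rack averages (in practice the local replicas pile up on the one writer node first), so treat the output as a ballpark rather than a measurement. The function name is just illustrative.

```python
from fractions import Fraction

def per_node_share(rack_share, nodes):
    """Average fraction of all block replicas held by each node in a rack."""
    return rack_share / nodes

rack1_nodes, rack2_nodes = 18, 6
rack2_share = Fraction(2, 3)      # assumption: two of every three replicas land on rack 2
rack1_share = 1 - rack2_share     # the remaining third stays on rack 1

rack1_per_node = per_node_share(rack1_share, rack1_nodes)
rack2_per_node = per_node_share(rack2_share, rack2_nodes)
even_share = Fraction(1, rack1_nodes + rack2_nodes)  # share per node if all 24 were loaded equally

print(rack2_per_node / rack1_per_node)  # 6
print(rack2_per_node / even_share)      # 8/3
```

So on average each node on the small rack would carry about six times as much data as a node on the big rack, and a bit under three times its even share.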
What is the best solution for keeping these two racks balanced?
I think one option is to change the rack topology so the cluster thinks you only have one rack.
You would lose the certainty that at least one replica of each block makes it to the second rack, though.
In case anyone is curious about a solution to this:
Unless you have two or more racks that have the same number of nodes, the solution is either to not use a rack topology script at all
or to bypass it by assigning all nodes to one rack.
The only downside is that you run the risk of missing blocks/data if one of your racks were to get smashed by a truck,
vaporized, beamed into outer space, melted, <insert catastrophic event that only affects one rack here>, etc.
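For the "assign all nodes to one rack" route, a minimal sketch of a topology script could look like the following. It assumes the usual Hadoop convention where the script named in `net.topology.script.file.name` receives one or more hostnames or IPs as arguments and prints one rack path per argument. The function name and the choice of `/default-rack` as the single rack are my own picks here, not something your cluster requires.

```python
#!/usr/bin/env python3
import sys

# Assumption: every node is reported as living in this one rack.
SINGLE_RACK = "/default-rack"

def resolve_racks(hosts):
    """Return one rack path per host, ignoring where the host physically sits."""
    return [SINGLE_RACK for _ in hosts]

if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments; print one rack per line.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Simply leaving the topology script unset should have the same effect, since every node then falls into the default rack anyway.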
2 uneven racks: rack topology should be 1 rack
2 even racks: rack topology should be 2 racks
2 even racks plus 1 uneven rack: rack topology should be 2 racks, with the 3rd rack's nodes split between rack 1 and rack 2
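For the last case, one way to do the split is to alternate the odd rack's nodes between the two logical racks. This is only a sketch: the host names and the `/rack1` and `/rack2` labels are made up, and the resulting mapping would still need to be served by whatever topology script or table you use.

```python
def split_odd_rack(rack1_hosts, rack2_hosts, odd_hosts):
    """Build a host -> logical rack mapping, dealing the odd rack's
    hosts alternately to /rack1 and /rack2 so both stay the same size."""
    mapping = {host: "/rack1" for host in rack1_hosts}
    mapping.update({host: "/rack2" for host in rack2_hosts})
    for i, host in enumerate(odd_hosts):
        mapping[host] = "/rack1" if i % 2 == 0 else "/rack2"
    return mapping

# Hypothetical hosts: two even physical racks plus a smaller third one.
topology = split_odd_rack(["a1", "a2"], ["b1", "b2"], ["c1", "c2", "c3"])
for host, rack in sorted(topology.items()):
    print(host, rack)
```

Keep in mind the same trade-off applies: nodes from the third physical rack now share a logical rack with others, so losing that one physical rack can take out replicas that the topology believes are on separate racks.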