Increasing Datanode vdisks

We run Hadoop FS in a virtualized cluster: 50 Datanode VMs with 5TB per datanode, spread across 30 or so hosts.
Previously this was the most we could fit, given the compute and storage available at the time.
This has been running great, and we now want to increase our storage.
Each VM has a single 5T vdisk that sits on top of a RAID60.


We've now got more storage, allowing me to increase the existing datanode disk from 5T to 8T.
- This'll give me another 150T without using any additional hardware.
We've got denser hosts coming, allowing for about 16TB per VM and about 15 VMs spread across these hosts.
- This'll give me another 240T if I'm able to utilize all of it.
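
For reference, the arithmetic behind those two figures, as a quick sketch. The numbers are the ones quoted above; the variable and function names are just for illustration, and this is raw capacity only, before any replication overhead.

```python
# Raw capacity arithmetic for the two expansion options described above.
# Figures are taken from the post; replication is NOT accounted for.

EXISTING_VMS = 50      # current datanode VMs
OLD_DISK_TB = 5        # current vdisk size per VM
NEW_DISK_TB = 8        # vdisk size after resizing
DENSE_VMS = 15         # approximate VM count on the denser hosts
DENSE_DISK_TB = 16     # approximate storage per VM on the denser hosts


def added_by_resize(vms, old_tb, new_tb):
    """Extra raw TB gained by growing every existing vdisk."""
    return vms * (new_tb - old_tb)


def added_by_new_nodes(vms, disk_tb):
    """Extra raw TB gained by adding new datanode VMs."""
    return vms * disk_tb


current_raw = EXISTING_VMS * OLD_DISK_TB
resize_gain = added_by_resize(EXISTING_VMS, OLD_DISK_TB, NEW_DISK_TB)
dense_gain = added_by_new_nodes(DENSE_VMS, DENSE_DISK_TB)
total_raw = current_raw + resize_gain + dense_gain

print(f"current: {current_raw}T, resize: +{resize_gain}T, "
      f"dense hosts: +{dense_gain}T, total: {total_raw}T")
```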


The above is great, but I'm cautious after reading about issues with differently sized Hadoop FS nodes.
Will I run into issues if I have a mix of 5T and 8T nodes?
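
To make the concern concrete, here is a rough sketch of what I'm worried about. It assumes every node receives the same amount of data regardless of its size, which is my simplification and not confirmed Hadoop behaviour, and the 25/25 split between 5T and 8T nodes is a hypothetical mid-migration state.

```python
# Sketch of the mixed-size worry, under a simplifying ASSUMPTION:
# data lands evenly on every node, ignoring how large each node is.
# This is not a model of how Hadoop actually places blocks.

def fill_when_smallest_full(node_sizes_tb):
    """Return (stored TB, total TB, utilization per node size) at the
    moment the smallest nodes are completely full."""
    smallest = min(node_sizes_tb)
    stored = smallest * len(node_sizes_tb)   # each node holds `smallest` TB
    total = sum(node_sizes_tb)
    per_size_util = {size: smallest / size for size in set(node_sizes_tb)}
    return stored, total, per_size_util


# Hypothetical mix: half the nodes resized to 8T, half still at 5T.
sizes = [5] * 25 + [8] * 25
stored, total, util = fill_when_smallest_full(sizes)

print(f"stored {stored}T of {total}T raw")
for size, fraction in sorted(util.items()):
    print(f"{size}T nodes at {fraction:.1%}")
```

Under that assumption the 5T nodes would be full while the 8T nodes still have free space, which is the situation I'd like to avoid.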


If I use the denser nodes in a similar setup, each VM on the denser hosts would then have 16T available.
Does Hadoop round-robin across nodes or partitions? Could I just set up 2x8T vdisks for the VMs backed by the larger disks?