When installing HDP on VMware or another virtual environment, how important is it, if at all, to spread the nodes across multiple hosts? I know that from a reliability and HA perspective it is very important, but what about performance? Let's say I have twenty nodes across four VMware hosts and need to add a few more nodes to handle a peak in data volume. Since my nodes are already spread across multiple hosts, what is the performance difference between adding the new nodes to the existing hosts and placing them on additional hosts?
My thought is that, unless I max out those hosts, I can't control what the virtualization team might put on them, so I wouldn't gain any performance reliability either by adding more nodes to the same hosts or by spreading them out. Does it matter, performance-wise?