I have a question on the HWX hardware recommendations.
There it says :
"Typically, a medium-to -large Hadoop cluster consists of a two-level or three-level architecture built with rack-mounted servers. Each rack of servers is interconnected using a 1 Gigabyte Ethernet (GbE) switch. Each rack-level switch is connected to a cluster-level switch (which is typically a larger port-density 10GbE switch). These cluster-level switches may also interconnect with other cluster-level switches or even uplink to another level of switching infrastructure."
So what surprises me is that the requirements for the inter-rack switch (1GbE) are actually lower than the top-of-rack switches (10GbE). Wouldn't the inter-rack traffic be just as high, if not higher, than the intra-rack traffic?
@disclaimer: I am not a network specialist
@Hellmar Becker Yeah well, I think you are right, but the text is a bit ambiguous, not very clear. So it is 10:1, but that would not scale out to 100:10, would it? I don't think there are switches like that around.
@Robert K. I agree, the text is a bit obscure.
I am not a network guy myself. But we had that discussion earlier, and there are network topologies that can be implemented if the bandwidth requirement exceeds the capacity of the biggest switch you can buy. See https://en.wikipedia.org/wiki/Clos_network.
I believe the documentation is ambiguous. "Each rack of servers is interconnected using a 1 GbE switch" is referring to the "top of rack" or "leaf" switch, which is intra-rack. The next sentence, "Each rack-level switch is connected to a cluster-level switch which is typically a larger port-density 10GbE switch," is referring to the "core" or "spine" switch, which is inter-rack.
So the intra rack switch is 1GbE and the inter rack switch is 10GbE as you expect.
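To put numbers on the intra- vs. inter-rack question, the figure that matters is the uplink oversubscription ratio of the rack switch. Here's a rough sketch; the server count and uplink count are illustrative assumptions, not numbers from the HWX doc:

```python
# Illustrative oversubscription calculation for a 1GbE top-of-rack design.
# The specific numbers (40 servers/rack, 2 uplinks) are assumptions for the example.

def oversubscription(servers_per_rack, server_gbps, uplinks, uplink_gbps):
    """Ratio of worst-case intra-rack demand to inter-rack uplink capacity."""
    downlink_capacity = servers_per_rack * server_gbps
    uplink_capacity = uplinks * uplink_gbps
    return downlink_capacity / uplink_capacity

# 40 servers at 1GbE each, with two 10GbE uplinks to the spine:
ratio = oversubscription(40, 1, 2, 10)
print(f"{ratio}:1 oversubscribed")  # 2.0:1
```

So even with a 1GbE leaf and 10GbE spine links, the inter-rack path is typically oversubscribed, which is considered acceptable in Hadoop designs because the framework tries to keep most traffic rack-local.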
For what it's worth, the cost of 10GbE has come down quite a bit. I prefer to deploy 10GbE top-of-rack switches with 10GbE to each server where possible, connected to spine switches over 40GbE uplinks.
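Applying the same back-of-the-envelope arithmetic to a 10GbE design like that (the server and uplink counts here are my assumptions, not a recommendation):

```python
# Hypothetical 10GbE rack: 24 servers at 10GbE each, four 40GbE spine uplinks.
servers, server_gbps = 24, 10
uplinks, uplink_gbps = 4, 40

ratio = (servers * server_gbps) / (uplinks * uplink_gbps)
print(f"oversubscription: {ratio}:1")  # 240 / 160 = 1.5:1
```

The point is that faster server links push the bottleneck to the uplinks, which is why the 40GbE (or faster) spine matters as much as the 10GbE to the host.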
This article is a bit older (10GbE is more cost effective now), but has good information in terms of network architecture: http://bradhedlund.com/2012/03/26/considering-10ge-hadoop-clusters-and-the-network/
Hello, the '90s are calling and they want their Ethernet back!
Honestly, in this day and age it is patently ridiculous to use only 1GbE to a server. Hortonworks should be ashamed that they still have that in their document.
Even 10GbE is long in the tooth: 25GbE is shipping and looking good, and 40GbE backbone links are moving to 100GbE.