I have 11 servers with the following rack setup:
Rack 1 => node1, node2, node3
Rack 2 => node4, node5, node6
Rack 3 => node7, node8
Rack 4 => node9, node10
Rack 5 => node11
Hadoop components are rack-aware. HDFS block placement uses rack awareness for fault tolerance by placing one block replica on a different rack. This keeps data available in the event of a network switch failure or a partition within the cluster. In the diagram below, each rack is its own layer 3 network with a /24 subnet, which could be typical where each rack has its own switch with uplinks to a central core router.
You should (and this is important) work with your network team to build redundancy at the network level within the datacenter.
           +-----------+
           |core router|
           +-----------+
            /         \
+-----------+         +-----------+
|rack switch|         |rack switch|
+-----------+         +-----------+
| data node |         | data node |
+-----------+         +-----------+
| data node |         | data node |
+-----------+         +-----------+
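Hadoop learns which rack a node belongs to from a topology script: it calls the script with one or more hostnames or IP addresses as arguments and expects one rack path per argument on standard output. Here is a minimal sketch for the 11-node layout at the top of this thread. The rack path names (`/rack1` and so on) and the lookup by short hostname are my assumptions; if your cluster passes IP addresses to the script, the table would need to be keyed by IP instead.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop rack topology script for the 11-node layout in this thread."""
import sys

# Rack membership comes from the layout posted above.
# The rack path names ("/rack1", ...) are illustrative assumptions.
RACK_BY_HOST = {
    "node1": "/rack1", "node2": "/rack1", "node3": "/rack1",
    "node4": "/rack2", "node5": "/rack2", "node6": "/rack2",
    "node7": "/rack3", "node8": "/rack3",
    "node9": "/rack4", "node10": "/rack4",
    "node11": "/rack5",
}

# Returned for anything not in the table, so unknown hosts still get an answer.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for a hostname, or DEFAULT_RACK if unknown."""
    # Drop any domain suffix so "node1.example.com" matches "node1".
    # An IP address will not match the table and falls through to the default.
    short_name = host.split(".")[0]
    return RACK_BY_HOST.get(short_name, DEFAULT_RACK)


def main(argv):
    # Print exactly one rack path per argument, in the order given.
    for host in argv:
        print(resolve_rack(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

The script would be made executable and referenced from `net.topology.script.file.name` in core-site.xml. Treat the mapping as a starting point and adjust it to whatever your final racking turns out to be.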
In your setup, did you think through how the components map to racks, especially the NN, DN, RM, ZK, and Kafka cluster nodes? The routers should also be redundant.
You will need to sit down, draw a topology of your cluster, and map the components to racks (see attached rack.jpg).
Hope that helps
Thank you for replying.
Below is the physical rack layout. I am trying to determine whether the architecture below will have any potential performance issues. If so, I plan to reach out to the network/data center team to re-rack the servers in order to minimize them.
For example, the active and standby NN should be on 2 different racks with a redundant route (core switch). The same goes for the RM, ZK, DN, and Kafka nodes. The Hadoop network setup goes much deeper than that; it is the core of a successful Hadoop HA setup. Some of the things to consider are:
- Redundancy at ISP level
- Rack level
- Core Switch
- Router
You will need to provide your logical requirements to the data center network team and they should be able to replicate that physically.
One condition is that if any one of these components (NN, DN, RM, ZK, Kafka cluster) fails, the cluster should still function normally. That depends on the redundancy setup of both the software (Hadoop) and the hardware (network). I would advise you to meet with the network team.
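To make that condition concrete before talking to the network team, you could check a proposed placement against the loss of each rack in turn. The sketch below uses the rack layout from this thread, but the service placement and the survivor thresholds are hypothetical values I made up for illustration; substitute your own.

```python
# Rack layout as posted in this thread.
RACKS = {
    "rack1": ["node1", "node2", "node3"],
    "rack2": ["node4", "node5", "node6"],
    "rack3": ["node7", "node8"],
    "rack4": ["node9", "node10"],
    "rack5": ["node11"],
}

# Hypothetical placement of master services (an assumption, not the poster's).
PLACEMENT = {
    "NN": ["node1", "node4"],           # active and standby NameNode
    "RM": ["node2", "node5"],           # active and standby ResourceManager
    "ZK": ["node3", "node6", "node7"],  # three-node ZooKeeper ensemble
}

# Minimum instances that must survive for each service to keep working.
# A three-node ZooKeeper ensemble needs a majority, i.e. two nodes.
MIN_SURVIVORS = {"NN": 1, "RM": 1, "ZK": 2}


def rack_of(node):
    """Return the name of the rack that holds the given node."""
    for rack, nodes in RACKS.items():
        if node in nodes:
            return rack
    raise KeyError(f"{node} is not in any rack")


def services_down_after_rack_loss(lost_rack, placement=PLACEMENT):
    """List the services that fall below their minimum if lost_rack fails."""
    down = []
    for service, nodes in placement.items():
        survivors = [n for n in nodes if rack_of(n) != lost_rack]
        if len(survivors) < MIN_SURVIVORS[service]:
            down.append(service)
    return down


def check_all_racks(placement=PLACEMENT):
    """Map every rack to the services that would go down if it failed."""
    return {rack: services_down_after_rack_loss(rack, placement) for rack in RACKS}


if __name__ == "__main__":
    for rack, down in check_all_racks().items():
        status = "OK" if not down else "loses " + ", ".join(down)
        print(f"{rack}: {status}")
```

This only models whole-rack failures. It says nothing about the core switch, router, or ISP level, which still need their own redundancy review with the network team.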