
RACK Awareness: How is performance impacted by uneven distribution of nodes across racks ?

New Contributor

I have 11 servers with the following rack setup:

Rack 1 => node1, node2, node3

Rack 2 => node4, node5, node6

Rack 3 => node7, node8

Rack 4 => node9, node10

Rack 5 => node11

  1. How will this uneven distribution impact the performance of HDFS, HBase, Kafka, MapReduce, and YARN?
  2. Are there any potential issues with having just one server in rack 5?

Thank you.


Re: RACK Awareness: How is performance impacted by uneven distribution of nodes across racks ?

Mentor

@Kashyap Magadi
Hadoop components are rack-aware. HDFS block placement uses rack awareness for fault tolerance by placing at least one replica of each block on a different rack. This keeps data available if a rack switch fails or the network partitions within the cluster. A typical layout gives each rack its own layer 3 network with a /24 subnet and its own switch, with uplinks to a central core router.
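
To get a feel for how replicas would spread over the five racks in the question, here is a rough Python sketch. It is a simplified model of the default 3-replica placement (first replica on the writer's node, second on a different rack, third on another node in the second replica's rack), not the actual HDFS code. The uniform random choices and the fallback used when the second replica lands on the single-node rack are my assumptions.

```python
import random
from collections import Counter

# Rack layout as posted in the question.
RACKS = {
    "rack1": ["node1", "node2", "node3"],
    "rack2": ["node4", "node5", "node6"],
    "rack3": ["node7", "node8"],
    "rack4": ["node9", "node10"],
    "rack5": ["node11"],
}
NODE_TO_RACK = {node: rack for rack, nodes in RACKS.items() for node in nodes}
ALL_NODES = sorted(NODE_TO_RACK)


def place_block(writer, rng):
    """Return three replica nodes for one block (simplified model)."""
    first = writer
    # Second replica: any node on a rack other than the writer's rack.
    remote = [n for n in ALL_NODES if NODE_TO_RACK[n] != NODE_TO_RACK[first]]
    second = rng.choice(remote)
    # Third replica: a different node on the same rack as the second replica.
    same_rack = [n for n in RACKS[NODE_TO_RACK[second]] if n != second]
    if same_rack:
        third = rng.choice(same_rack)
    else:
        # Assumption: with a single-node rack, fall back to any other node.
        third = rng.choice([n for n in ALL_NODES if n not in (first, second)])
    return [first, second, third]


def simulate(num_blocks=10000, seed=0):
    """Count replicas per node, with writers picked uniformly at random."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_blocks):
        counts.update(place_block(rng.choice(ALL_NODES), rng))
    return counts


if __name__ == "__main__":
    for node, replicas in sorted(simulate().items()):
        print(node, NODE_TO_RACK[node], replicas)
```

Comparing the per-node counts across racks gives a first impression of storage skew. Real clusters also weigh free space and node load, so treat the numbers as indicative only.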

You should (and this is important) work with your network team to build redundancy at the network level within the datacenter.

For example:

              +-----------+
              |core router|
              +-----------+
             /             \
+-----------+               +-----------+
|rack switch|               |rack switch|
+-----------+               +-----------+
| data node |               | data node |
+-----------+               +-----------+
| data node |               | data node |
+-----------+               +-----------+

In your setup, did you think through thoroughly how the components map to racks, especially the NN, DN, RM, ZK, and Kafka cluster? The routers should also be redundant.

You will need to sit down, draw a topology of your cluster, and map its components to racks (see attached rack.jpg).
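
Once the mapping is drawn up, Hadoop learns it through a topology script configured with the `net.topology.script.file.name` property. The script receives one or more hosts as arguments and prints one rack path per argument. Below is a minimal Python sketch for the layout in the question. The `/rackN` names and the hostname keys are assumptions for illustration; Hadoop often passes IP addresses, so a real table would likely need those as well.

```python
#!/usr/bin/env python3
import sys

# Assumed mapping, taken from the rack layout posted in the question.
# In practice the keys may need to be IP addresses as well as hostnames.
HOST_TO_RACK = {
    "node1": "/rack1", "node2": "/rack1", "node3": "/rack1",
    "node4": "/rack2", "node5": "/rack2", "node6": "/rack2",
    "node7": "/rack3", "node8": "/rack3",
    "node9": "/rack4", "node10": "/rack4",
    "node11": "/rack5",
}
# Hosts that are not in the table fall back to Hadoop's default rack name.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [HOST_TO_RACK.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop invokes the script with hosts as arguments and reads stdout.
    print(" ".join(resolve(sys.argv[1:])))
```

Running the script by hand with a few hostnames is a quick way to confirm that every node resolves to the rack you expect before pointing the cluster at it.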

Hope that helps


rack.jpg

Re: RACK Awareness: How is performance impacted by uneven distribution of nodes across racks ?

New Contributor
 

Thank you for replying.

Below is the physical rack layout. I am trying to determine whether this architecture will cause any performance issues. If so, I plan to ask the network/data center team to re-rack the servers to minimize them.

70401-rack-layout.jpg


Re: RACK Awareness: How is performance impacted by uneven distribution of nodes across racks ?

Mentor

@Kashyap Magadi

For example, the active and standby NN should be on two different racks with a redundant route (core switch). The same applies to the RM, ZK, DN, and Kafka nodes. The Hadoop network setup goes much deeper than that; it's the core of a successful Hadoop HA setup. Some of the things to consider are:

- Redundancy at ISP level
- Rack level
- Core Switch
- Router 

You will need to provide your logical requirements to the data center network team and they should be able to replicate that physically.

One condition is that if any one of these components (NN, DN, RM, ZK, Kafka) fails, the cluster should still function normally. That depends on both the software (Hadoop) and hardware (network) redundancy setup. I would advise you to meet with the network team.
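
As a quick sanity check before that meeting, you could test a planned component placement against the loss of each whole rack. The sketch below is only an illustration: the `PLACEMENT` table is hypothetical (the thread does not say where the services run), and the minimum-alive counts assume one surviving instance is enough for the NN and RM pairs and that a 3-node ZK ensemble needs 2.

```python
# Rack layout as posted in the question.
NODE_TO_RACK = {
    "node1": "rack1", "node2": "rack1", "node3": "rack1",
    "node4": "rack2", "node5": "rack2", "node6": "rack2",
    "node7": "rack3", "node8": "rack3",
    "node9": "rack4", "node10": "rack4",
    "node11": "rack5",
}

# HYPOTHETICAL placement: service -> (hosts, minimum instances that must survive).
PLACEMENT = {
    "NameNode": (["node1", "node4"], 1),
    "ResourceManager": (["node2", "node5"], 1),
    "ZooKeeper": (["node3", "node6", "node7"], 2),
}


def racks_that_break(hosts, min_alive):
    """Return the racks whose loss leaves fewer than min_alive instances."""
    broken = []
    for rack in sorted(set(NODE_TO_RACK.values())):
        alive = [h for h in hosts if NODE_TO_RACK[h] != rack]
        if len(alive) < min_alive:
            broken.append(rack)
    return broken


if __name__ == "__main__":
    for service, (hosts, min_alive) in PLACEMENT.items():
        print(service, racks_that_break(hosts, min_alive) or "survives any single rack loss")
```

An empty result for every service means no single rack failure takes that service down. It says nothing about switch or router redundancy, which still has to come from the network design.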
