Rack Topology - send 3rd copy to 3rd rack

Explorer

Hello Team,

I need your help setting up and testing the requirement below.

I have 3 racks, with datanodes distributed across all 3 racks.

My requirement is that the 3rd copy of every block should go to the third rack in any scenario. Is there any way I can set this up?

I have read many blogs, but nowhere is it confirmed that configuring rack topology ensures that all 3 copies of the data go to 3 different racks.

Kindly suggest.

- Vijay M

3 REPLIES

Re: Rack Topology - send 3rd copy to 3rd rack

Expert Contributor

Hello,

Can you please tell us which documentation you have reviewed? Setting the rack location of each host is normally what determines block placement. If HDFS, for example, is aware of your topology, it should ensure that at least one replica is placed on another rack.

https://www.cloudera.com/documentation/enterprise/5-15-x/topics/cm_mc_specify_rack.html

https://community.hortonworks.com/articles/43057/rack-awareness-1.html
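
For illustration only, here is a minimal sketch of how a host-to-rack mapping can be expressed as a topology script on a plain Apache Hadoop cluster, where the script is referenced by the `net.topology.script.file.name` property. In Cloudera Manager the rack is normally assigned per host through the UI instead, so this is not needed there. The IP addresses, the rack names, and the `resolve_racks` helper are all assumptions made up for the example.

```python
#!/usr/bin/env python3
"""Hypothetical rack topology script: maps host addresses to rack paths."""
import sys

# Assumed example mapping; replace with the real hosts and racks.
RACK_BY_HOST = {
    "10.0.0.11": "/rack1",
    "10.0.0.21": "/rack2",
    "10.0.0.22": "/rack2",
    "10.0.0.31": "/rack3",
}
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more addresses as arguments and reads the
    # rack of each one from standard output, separated by whitespace.
    print(" ".join(resolve_racks(sys.argv[1:])))
```

Hosts that are missing from the mapping fall back to `/default-rack`, so a typo in the table silently puts a node on the default rack; it is worth checking the reported racks after any change.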

Customer Operations Engineer | Security SME | Cloudera, Inc.

Re: Rack Topology - send 3rd copy to 3rd rack

Explorer
@ihebert,

I referred to the Cloudera documentation only.

For testing, our test environment has 4 datanodes.

Through Cloudera Manager I placed 1 datanode each in rack 1 and rack 3, and kept the remaining 2 datanodes in rack 2.

But when I created a new file in HDFS, with the replication factor set to 3, it was written to datanodes in rack 2 and rack 3 only.

As per rack topology, all 3 replicas should go to 3 different racks, but in my case they do not.

Kindly suggest.

Vijay M

Re: Rack Topology - send 3rd copy to 3rd rack

Expert Contributor

Hello,

Please review the Hortonworks community documentation. It currently covers rack awareness better than our documentation does, and it is accurate. The behaviour you are describing is exactly how rack awareness works.

When HDFS is made rack aware, it places 2 replicas within the same rack and the 3rd in a remote rack. That is because nodes within the same rack are preferable both for the HDFS framework and for most job schedulers. With a replication factor of 3, HDFS will not, by default, place a replica of a block on every rack.
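
To make the behaviour above concrete, here is a simplified Python simulation of rack-aware placement with a replication factor of 3. It is only a sketch of the idea, not the actual HDFS implementation: the `place_replicas` function, the node names, and the fallback order for the 3rd replica are assumptions, and it expects at least 3 nodes on at least 2 racks. The topology mirrors the test cluster described earlier in the thread (1 datanode in rack 1, 2 in rack 2, 1 in rack 3).

```python
import random


def place_replicas(topology, writer=None, rng=random):
    """Pick 3 nodes for one block; topology maps node name -> rack path."""
    nodes = list(topology)

    # 1st replica: the writer's own node if it is a datanode, else any node.
    first = writer if writer in topology else rng.choice(nodes)

    # 2nd replica: a node on a different rack from the first.
    remote = [n for n in nodes if topology[n] != topology[first]]
    second = rng.choice(remote)
    chosen = [first, second]

    def free_nodes_on(rack):
        return [n for n in nodes if topology[n] == rack and n not in chosen]

    # 3rd replica: prefer the rack of the 2nd replica, then the rack of the
    # 1st, then any unused node (simplified fallback, an assumption here).
    pool = (free_nodes_on(topology[second])
            or free_nodes_on(topology[first])
            or [n for n in nodes if n not in chosen])
    chosen.append(rng.choice(pool))
    return chosen


topology = {"dn1": "/rack1", "dn2": "/rack2", "dn3": "/rack2", "dn4": "/rack3"}
replicas = place_replicas(topology, writer="dn2", rng=random.Random(0))
print(replicas, sorted({topology[n] for n in replicas}))
```

With the writer on `dn2`, this sketch always ends up using rack 2 plus exactly one other rack, which matches what was observed on the test cluster. The placement aims for 2 racks, not 3, so adding a third rack does not by itself spread the 3 replicas across all of them.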

Customer Operations Engineer | Security SME | Cloudera, Inc.