
Best Practices Rack Awareness with HDFS in AWS




Following Cloudera's best practices for deploying CDH on AWS, I have some questions.


We have a cluster with a replication factor of 3. That means that if one rack goes down, my data is still available on another rack.


We have configured rack awareness based on the availability zone where each instance is located. For example:


instance a (east-1a) -> rack /a

instance b (east-1b) -> rack /b

instance c (east-1c) -> rack /c
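
For readers who want to see what such a mapping can look like, here is a minimal sketch of a Hadoop topology script, assuming a plain Hadoop setup where the script is referenced through `net.topology.script.file.name` (Cloudera Manager can also assign racks to hosts directly). The IP addresses and the `HOST_TO_AZ` table are hypothetical placeholders and not part of the original setup; only the zone-to-rack naming (east-1a -> /a) follows the example above.

```python
#!/usr/bin/env python3
"""Sketch of an HDFS topology script that maps hosts to racks by availability zone."""
import sys

# Hypothetical host -> availability zone table; replace with your own inventory.
HOST_TO_AZ = {
    "10.0.1.10": "east-1a",
    "10.0.2.10": "east-1b",
    "10.0.3.10": "east-1c",
}

# Rack returned for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def rack_for(host: str) -> str:
    """Return a rack path such as /a for a host in east-1a."""
    az = HOST_TO_AZ.get(host)
    if az is None:
        return DEFAULT_RACK
    # The last character of the zone name (a, b, c) becomes the rack name.
    return "/" + az[-1]


def main(argv):
    # Hadoop passes one or more IPs/hostnames as arguments and reads
    # one rack path per argument from standard output.
    for host in argv:
        print(rack_for(host))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running the script with two of the placeholder addresses as arguments prints one rack path per line, in the same order as the arguments.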


In that case, we can't use placement groups, as a placement group should have all of its instances in the same availability zone.


So my question is: what is the best practice here? If we use placement groups, we can't use rack awareness.




Best practice is to not spread across AZs. This is not a supported configuration and will cause issues. I was just migrating between AZs and ran into problems when I tried having some DNs in a different AZ and letting the data replicate over that way. This did not work, and there were many timeouts. I could have tried tweaking the configs to get it to work, but it wasn't worth it, and such high values would not be advisable in production.

With all that said, I would put everything in one AZ in a PG and randomly assign racks to provide boundaries for replication. I say randomly because you don't know which nodes are physically closer, so you won't get the benefit of reduced network traffic and congestion, but it will ensure that the second and third replicas are assigned to a rack other than the one the first replica was written to. In all honesty, though, it matters little, as even with all nodes in a single rack HDFS will still assign the replicas to other nodes. The risk of AWS having a rack failure that takes out the 3 nodes holding all replicas of a single block is present in either scenario.
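
As a rough illustration of the random rack assignment described above, here is a small sketch that shuffles the hosts with a fixed seed and deals them out round-robin, so the logical racks end up evenly sized. The function name `assign_racks`, the `/rackN` naming, and the hostnames are all hypothetical; the round-robin balancing is my own addition on top of "randomly assign racks".

```python
import random


def assign_racks(hosts, num_racks=3, seed=0):
    """Assign each host to one of num_racks logical racks.

    Hosts are shuffled with a fixed seed (so the result is repeatable)
    and then dealt out round-robin, which keeps rack sizes within one
    host of each other.
    """
    shuffled = sorted(hosts)               # start from a stable order
    random.Random(seed).shuffle(shuffled)  # arbitrary but repeatable order
    return {host: f"/rack{i % num_racks}" for i, host in enumerate(shuffled)}


if __name__ == "__main__":
    # Hypothetical DataNode hostnames.
    nodes = [f"dn{n}.example.internal" for n in range(1, 10)]
    for host, rack in sorted(assign_racks(nodes).items()):
        print(host, rack)
```

Adding a host later changes the shuffle, so in practice you would generate the mapping once and store it, rather than recompute it every time the cluster grows.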

Note: You may not be able to add additional nodes to a PG after it has been set up. It depends on node availability in the AZ for that instance type and on AWS's black magic for moving instances around to fit them in. I have had issues with not being able to add more nodes to an existing PG. You will likely have to do a full migration to grow the cluster (or be good friends with AWS).


EC2 recently introduced Partition Placement Groups for rack-aware applications -
