Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

Are there HDP applications where the latency between availability zones (AZs), approximately 1 ms, is significant? It seems like rack awareness could be used, treating each AZ as a different rack.

  • Is this the common way to handle this in practice?
  • Does anyone have examples of SLAs for clusters with and without multiple AZs?
  • Anything else to be aware of regarding EC2 AZs (or the equivalents at other cloud providers)?

Accepted Solutions

Re: Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

Guru

Alex, I would not recommend that customers deploy clusters across availability zones. While it is technically feasible to use rack awareness to treat each AZ as a separate rack, I haven't seen us recommend this in the past, and other distribution providers go as far as to say that multi-AZ deployment is not supported.
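
For reference, the "one rack per AZ" idea discussed above could be sketched as a Hadoop topology script, the kind of script that `net.topology.script.file.name` in core-site.xml points to. This is only an illustration of what is technically feasible, not a recommended or supported layout. The IP addresses, AZ names, and the hand-maintained lookup table are all assumptions made up for the example; a real deployment would need its own source for the host-to-AZ mapping.

```python
#!/usr/bin/env python3
"""Hypothetical Hadoop topology script: reports each host's AZ as its rack."""
import sys

# Assumption: a hand-maintained host/IP -> AZ table. All values are illustrative.
HOST_TO_AZ = {
    "10.0.1.11": "us-east-1a",
    "10.0.2.11": "us-east-1b",
    "10.0.3.11": "us-east-1c",
}

# Rack path returned for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return a rack path such as '/us-east-1a', or the default rack if unknown."""
    az = HOST_TO_AZ.get(host)
    return "/" + az if az else DEFAULT_RACK


def main(argv):
    # Hadoop passes one or more hosts/IPs as arguments and reads back
    # one whitespace-separated rack path per argument, in the same order.
    print(" ".join(resolve_rack(host) for host in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

With a mapping like this, HDFS block placement would spread replicas across AZs the same way it spreads them across physical racks, which is also where the cross-AZ data transfer and latency costs come from.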

Re: Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

@Alex Miller I doubt that you will find an exact answer to this. This is a good starting point, and based on your use case you can gather more data.

Re: Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

Ok, across AWS Regions I understand, but it seems like AZs should have minimal performance impacts (latency isn't much higher) and would provide redundancy for HA.

Either way, I'm glad to hear feedback from what is seen in the field and from other providers.

Re: Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

New Contributor

Greetings @Paul Codding, it has been a few years since activity on this thread and our team is wondering if it is still the case that Hortonworks does not recommend spanning multiple availability zones to implement Hadoop high availability in AWS?

In a recent post on the subject @fschneider replied "that in case of HA clusters the HA nodes should be launched in different availabilty zones". https://community.hortonworks.com/questions/176198/will-single-availability-zone-provide-high-availa...

Other vendors are recommending a deployment methodology that spans AWS availability zones while also noting data transfer costs, network latency and throughput considerations.

Many thanks in advance!

Re: Best practices for spanning AWS availability zones (or equivalent at other cloud providers)

New Contributor

Amazon EC2 recently introduced Partition Placement Groups for rack-aware applications:

https://aws.amazon.com/blogs/compute/using-partition-placement-groups-for-large-distributed-and-repl...