HDP services High Availability

Hi,

I understand that some of the services can be set up in HA mode, as documented in the docs. However, I am trying to understand what "High Availability" means for the following HDP services / components.

  • Tez
  • Spark (Presume it's client-only and hence HA won't be applicable, as multiple clients can be installed)
  • Slider
  • Phoenix (Presume it's client-only and hence HA won't be applicable, as multiple clients can be installed)
  • Accumulo
  • Storm (Is it all about setting up Nimbus HA?)
  • Falcon
  • Atlas
  • Sqoop (Presume it's client-only and hence HA won't be applicable, as multiple clients can be installed. But I am wondering about the role of the database behind Sqoop)
  • Flume
  • Ambari (Presume no native HA available at the moment, but planned for future)
  • Zookeeper (Presume Zookeeper itself is inherently HA due to its ensemble, and that's what provides HA to many other components. But I wanted to understand if there is more to this.)
  • Knox
Accepted Solution

  • Tez and Slider are also client-only, so HA is not applicable
  • Phoenix: Depends on HBase and ZooKeeper
  • Accumulo: Multiple Accumulo masters can be run; one of them is active and the rest are backups
  • Storm: Multiple Nimbus instances supported, automatic failover
  • Falcon: HA is available, but the failover is a manual process, details here
  • Atlas: A backup instance can be run, but the failover is manual (like Falcon). Automated failover expected in version 0.7
  • Sqoop: metastore backup is usually enough
  • Flume: You can run Flume agents behind a load balancer, more details here
  • Zookeeper: Inherently HA if you run 3 or more instances. Furthermore, ensure ZK stores its data on RAID-10 disks
  • Knox: Multiple instances can be configured behind an LB, more for load balancing but also for HA
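
To make the "3 or more instances" point for ZooKeeper concrete, here is a minimal sketch of the majority-quorum arithmetic. The function names are made up for illustration and are not part of any ZooKeeper API; the only assumption is that the ensemble stays available as long as a strict majority of its servers is up.

```python
def zk_quorum(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble (hypothetical helper)."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def zk_tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be down while a majority is still up."""
    return ensemble_size - zk_quorum(ensemble_size)


if __name__ == "__main__":
    # Print quorum size and tolerated failures for small ensembles.
    for n in range(1, 8):
        print(f"servers={n} quorum={zk_quorum(n)} "
              f"tolerated_failures={zk_tolerated_failures(n)}")
```

Under that assumption, 1 or 2 servers tolerate no failures, 3 tolerate one, and 4 still tolerate only one, which is why ensembles are usually sized at 3 or 5.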

@Predrag Minovic - Slight update on Storm: we can run multiple Nimbus servers:

https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Users_Guide/content/ch05s05.html

This is to deal with cases where Nimbus can't be automatically restarted (e.g. disk failure on the node). Details of Nimbus HA are outlined here:

http://hortonworks.com/blog/fault-tolerant-nimbus-in-apache-storm/

Hi @Laurence Da Luz, thanks for the correction! I'll edit my answer.