Impala High Availability?


I am using CDH 5.1 and have already configured HDFS high availability. I would like Impala to be highly available as well.


I thought of the following components:

  • Impala Daemon: No problem (running multiple instances already)
  • Impala Statestore: No problem (according to the docs)
  • Impala Catalogd: I can't find any information on the effect of losing it. Any ideas?
  • Hive Metastore: There is no documentation about HA in 5.1 (but it exists in 5.2), so I guess this is a single point of failure?
    CM 5.1 does allow me to create multiple metastore servers. What does that achieve?
  • Hive Metastore Database: I understand HA for this should be configured at the PostgreSQL level.
    Do I need to stop the metastore before restarting PostgreSQL? And does the metastore support a PostgreSQL database behind HAProxy?


Any other considerations?