I am using CDH 5.1 and have already configured HDFS high availability. I would now like to make Impala highly available as well.
I thought of the following components:
Impala Daemon: No problem (running multiple instances already)
Impala Statestore: No problem (according to doc)
Impala Catalogd: I can't find any information on the effect of losing this. Any ideas?
Hive Metastore: There is no documentation about HA in 5.1 (though it exists for 5.2), so I guess this is a single point of failure? CM 5.1 does allow me to create multiple metastore servers. What does that achieve?
Hive Metastore Database: I understand HA for this should be configured at the PostgreSQL level. Do I need to stop the metastore before restarting PostgreSQL? And does the metastore support a PostgreSQL database behind HAProxy?