Created on 06-01-2018 12:31 PM - edited 09-16-2022 06:17 AM
I am getting started with an HDP cluster. I have a basic question about MySQL and Hive when deployed in cluster mode.
Does setting up the cluster through Ambari admin also create a cluster of MySQL servers and Hive servers?
Created 06-01-2018 09:54 PM
Any updates?
If you found an answer addressed your question, please take a moment to log in and click the "accept" link on the answer.
Created on 06-01-2018 01:55 PM - edited 08-17-2019 09:34 PM
When adding Hive as a service via Ambari, you can select one of the following database options:
1. New MySQL Database -> With this option, Ambari installs a MySQL server database and configures it automatically
2. Existing MySQL Database -> With this option, you need to install the MySQL server database yourself, add a user, and create the hive database
3. Existing PostgreSQL Database
4. Existing Oracle Database
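As a rough sketch of the manual steps behind option 2, the helper below only builds the SQL statements you would run on the MySQL server; it does not connect to anything. The function name, the `hive` database and user names, the `CHANGE_ME` password, and the `%` host wildcard are all illustrative assumptions, so adjust them to your own environment and security policy.

```python
def metastore_setup_statements(db_name="hive", user="hive",
                               password="CHANGE_ME", host="%"):
    """Return MySQL statements that prepare a database and user for Hive.

    All default values are placeholders (assumptions), not values that
    Ambari requires.
    """
    # Refuse characters that would break the simple quoting used below.
    for value in (db_name, user, password, host):
        if any(ch in value for ch in "'`\\"):
            raise ValueError(f"unsupported character in {value!r}")
    return [
        f"CREATE DATABASE `{db_name}`;",
        f"CREATE USER '{user}'@'{host}' IDENTIFIED BY '{password}';",
        f"GRANT ALL PRIVILEGES ON `{db_name}`.* TO '{user}'@'{host}';",
        "FLUSH PRIVILEGES;",
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run
    # in a MySQL client as an administrative user.
    print("\n".join(metastore_setup_statements()))
```

Restricting `host` to the machine that runs the Hive metastore, instead of `%`, is usually the safer choice.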
HTH
*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.
Created 06-01-2018 02:17 PM
I guess the "New MySQL Database" option, which installs a MySQL server database and configures it automatically, wouldn't set up MySQL in cluster mode.
Created 06-01-2018 02:27 PM
@Pankaj Singh Yes, if you would like to have MySQL in cluster mode, you need to perform this configuration manually.
Created on 06-01-2018 02:03 PM - edited 08-17-2019 09:34 PM
Does setting up the cluster through Ambari admin also create a cluster of MySQL servers and Hive servers? (NO)
You will need an RDBMS for the Hive metastore service, which stores the metadata for Hive tables and partitions in a relational database.
Hive is data warehouse software built on top of Hadoop that provides data summarization, query, and analysis. It gives a SQL-like interface for querying data stored in HDFS. Hive clients obtain table and partition metadata from the Hive metastore, which exposes this information through the metastore service API.
When planning a robust (production) cluster, you shouldn't use the Derby database but one of the following: Oracle, MySQL, MS SQL, MariaDB, etc. These databases should be set up before running Ambari or during the Ambari server setup.
These components need a relational database: Ambari, Hive, Oozie, Ranger.
You can enable Hive metastore high availability (HA) so that your cluster is resilient to failures caused by a metastore becoming unavailable, with each metastore instance being independent. See the attached HiveMetaHA.
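To illustrate what clients see with metastore HA, here is a minimal sketch that builds the value of the `hive.metastore.uris` property, which takes a comma-separated list of Thrift URIs. The hostnames are hypothetical, and 9083 is assumed here as the metastore port; check the value your cluster actually uses.

```python
def metastore_uris(hosts, port=9083):
    """Build a comma-separated hive.metastore.uris value.

    `hosts` is a list of metastore hostnames; the port is an assumption
    and should match your cluster's metastore configuration.
    """
    if not hosts:
        raise ValueError("at least one metastore host is required")
    return ",".join(f"thrift://{host}:{port}" for host in hosts)


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    print(metastore_uris(["metastore1.example.com", "metastore2.example.com"]))
```

With more than one URI listed, a client can fall back to another metastore instance when one is unreachable.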
Created 06-01-2018 02:16 PM
Thanks for your answer. To have a production-like setup, if I understand correctly, we need to set up Hive and MySQL in cluster mode after the HDP cluster setup.
Created 06-01-2018 02:57 PM
Not really. I usually set up the MySQL databases and test connectivity before the cluster setup.
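One simple way to test connectivity beforehand is a plain TCP check against the MySQL port from each host that will need the database. This is only a sketch: the function name is made up for illustration, 3306 is assumed as the MySQL port, and a successful TCP connection does not prove that the user and grants are correct, so a real login with the hive user is still worth doing.

```python
import socket


def can_connect(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_connect("mysql.example.com")` (a hypothetical hostname) returns False when the port is blocked by a firewall or the server is not listening.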