HDP cluster for mysql server & hive server

avatar

I am getting started with an HDP cluster and have a basic question about MySQL and Hive when deployed in cluster mode.

Does setting up the cluster through Ambari admin also create a cluster of MySQL servers and Hive servers?

1 ACCEPTED SOLUTION

avatar
Master Mentor

@Pankaj Singh

Any updates?

If you found an answer addressed your question, please take a moment to log in and click the "accept" link on the answer.

7 REPLIES

avatar

@Pankaj Singh

When adding Hive as a service via Ambari, you can select one of the following:

1. New MySQL Database -> This option installs a MySQL server database and configures it automatically.

2. Existing MySQL Database -> With this option you need to install the MySQL server yourself, add a user, and create the hive database.

3. Existing PostgreSQL Database

4. Existing Oracle Database
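
For the "Existing MySQL Database" option, the manual preparation could look roughly like the sketch below. It only prints the SQL statements one might run on the MySQL server; the helper name `metastore_bootstrap_sql`, the `hive` database and user names, the `%` host wildcard, and the placeholder password are all assumptions for illustration, not values that Ambari requires.

```python
def metastore_bootstrap_sql(database="hive", user="hive",
                            password="CHANGE_ME", host="%"):
    """Build MySQL statements that prepare a database for the Hive metastore.

    All default values are illustrative placeholders (assumptions).
    """
    account = f"'{user}'@'{host}'"
    return [
        # Database that will hold the metastore tables.
        f"CREATE DATABASE IF NOT EXISTS {database};",
        # Account the Hive metastore service will log in with.
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        # Give that account full rights on the metastore database only.
        f"GRANT ALL PRIVILEGES ON {database}.* TO {account};",
        "FLUSH PRIVILEGES;",
    ]


if __name__ == "__main__":
    for statement in metastore_bootstrap_sql():
        print(statement)
```

The printed statements can be pasted into a `mysql` session as an administrative user; replace the placeholder password before doing so.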

76553-hive-jdbc.png

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.

avatar

I guess the "New MySQL Database" option, which installs a MySQL server database and configures it automatically, wouldn't set up MySQL in cluster mode.

avatar

@Pankaj Singh Yes, if you would like to have MySQL in cluster mode, you need to perform this configuration manually.

avatar
Master Mentor

@Pankaj Singh

Does setting up the cluster through Ambari admin also create a cluster of MySQL server & Hive server? No.

You will need an RDBMS for the Hive metastore service, which stores the metadata for Hive tables and partitions in a relational database.

Hive is data warehouse software built on top of Hadoop for providing data summarization, query, and analysis. It gives a SQL-like interface to query data stored in HDFS. The Hive metastore holds the table and partition metadata that queries rely on, and it provides clients with access to this information through the metastore service API.

When planning a robust (production) cluster, you shouldn't use the Derby database but one of the following: Oracle, MySQL, MS SQL, MariaDB, etc. These databases should be set up before running Ambari or during the Ambari server setup.

These components need a relational database: Ambari, Hive, Oozie, Ranger.

You can enable Hive metastore high availability (HA) so that your cluster is resilient to failures caused by a metastore becoming unavailable, with each metastore instance being independent. See the attached HiveMetaHA.

76554-hivemeta-ha.jpg
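
As a rough illustration of the HA idea: with more than one metastore instance, Hive clients are typically pointed at all of them through the `hive.metastore.uris` property, a comma-separated list of Thrift URIs. The sketch below just builds that value; the helper name and the host names are hypothetical, and 9083 is assumed here as the usual metastore port.

```python
def metastore_uris(hosts, port=9083):
    """Build a comma-separated hive.metastore.uris value.

    `hosts` are placeholder metastore host names (assumptions);
    9083 is assumed as the metastore Thrift port.
    """
    return ",".join(f"thrift://{host}:{port}" for host in hosts)


if __name__ == "__main__":
    # Two independent metastore instances, hypothetical host names.
    print(metastore_uris(["metastore1.example.com", "metastore2.example.com"]))
```

If one of the listed instances is down, clients can fall back to another URI in the list, which is what makes the setup resilient.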

Steps of setting up Metadata databases

avatar

Thanks for your answer. To have a production-like setup, if I understand correctly, we need to set up Hive & MySQL in cluster mode after the HDP cluster setup?

avatar
Master Mentor

@Pankaj Singh

Not really. I usually set up the MySQL databases and test connectivity before the cluster setup.
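
One simple way to test connectivity ahead of the cluster setup is a plain TCP check from each node that will talk to the database. This is only a sketch: the function name `can_connect` and the host name in the usage line are hypothetical, 3306 is assumed as the MySQL port, and the check confirms only that the port is reachable, not that the database user and password work.

```python
import socket


def can_connect(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Only checks network reachability; it does not log in to MySQL.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical database host; replace with your own.
    print(can_connect("mysql.example.com"))
```

Running this from the Ambari server and from the Hive, Oozie, and Ranger hosts helps catch firewall or bind-address problems before the installer does.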
