Collocating roles on 3 VMs?

Hi, does anyone see a problem with collocating the following roles (assuming sufficient RAM on the VM)?

 

Server A: Hive Gateway, Hive Metastore, HiveServer2, Oozie, Sqoop 2

Server B: Hive Gateway, Hive Metastore, HiveServer2, Impala Catalog, Impala StateStore (not sure if I need Hive running on both?)

Server C: Hue, all 5 Cloudera Management Services

 

I have a separate MySQL VM hosting the "amon", "hive", and "impala_catalog" databases. The above does not include the 5 DataNodes and 2 NameNodes, which are separate physical machines (also running the ZK, JournalNode, Impala Daemon, and YARN roles). Suggestions? Thank you.

Mike

1 ACCEPTED SOLUTION

Hi Mike,

You should only have 1 instance of the Hive Metastore.
Hive Gateway should usually be on all hosts (otherwise /etc/hive/conf won't be populated). Note that this is not a daemon role; it's just a marker for where client configuration should go (client config goes to any host that has a role from the service).
You normally only need 1 instance of HiveServer2. Multiple instances can be helpful for load balancing, but I doubt that's relevant with such a small cluster.
It's good to co-locate the Impala Catalog and StateStore with the Hive Metastore. For similar reasons, it's good to have HS2 on the same host as the metastore, though that is less important.
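
The placement rules above can be sketched as a small check, in case it helps to sanity-test a layout before assigning roles. This is only an illustration: the host names (server-a, server-b, server-c), the `check_layout` helper, and the role strings are assumptions made up for this example, and nothing here talks to Cloudera Manager.

```python
from typing import Dict, List, Set

# Any role from the Hive service causes Hive client config to be deployed
# to that host, so a Gateway is only needed where no other Hive role runs.
HIVE_ROLES: Set[str] = {"Hive Gateway", "Hive Metastore", "HiveServer2"}


def hosts_with(layout: Dict[str, Set[str]], role: str) -> List[str]:
    """Return the sorted host names that carry the given role."""
    return sorted(host for host, roles in layout.items() if role in roles)


def check_layout(layout: Dict[str, Set[str]]) -> List[str]:
    """Return warnings for a host -> roles mapping (hypothetical helper)."""
    warnings: List[str] = []

    # Rule 1: exactly one Hive Metastore instance.
    metastores = hosts_with(layout, "Hive Metastore")
    if len(metastores) != 1:
        warnings.append(f"Hive Metastore: expected 1 instance, found {len(metastores)}")

    # Rule 2: every host should have some Hive role so /etc/hive/conf is populated.
    for host in sorted(layout):
        if not (layout[host] & HIVE_ROLES):
            warnings.append(f"{host}: no Hive role, so /etc/hive/conf will not be populated")

    # Rule 3: Impala Catalog and StateStore should sit with the Hive Metastore.
    for role in ("Impala Catalog", "Impala StateStore"):
        for host in hosts_with(layout, role):
            if host not in metastores:
                warnings.append(f"{host}: {role} is not co-located with the Hive Metastore")

    return warnings


# Layout as posted in the question (host names are placeholders).
ORIGINAL = {
    "server-a": {"Hive Gateway", "Hive Metastore", "HiveServer2", "Oozie", "Sqoop 2"},
    "server-b": {"Hive Gateway", "Hive Metastore", "HiveServer2",
                 "Impala Catalog", "Impala StateStore"},
    "server-c": {"Hue", "Cloudera Management Services"},
}

# One possible layout that follows the advice above.
REVISED = {
    "server-a": {"Hive Gateway", "Oozie", "Sqoop 2"},
    "server-b": {"Hive Gateway", "Hive Metastore", "HiveServer2",
                 "Impala Catalog", "Impala StateStore"},
    "server-c": {"Hive Gateway", "Hue", "Cloudera Management Services"},
}

if __name__ == "__main__":
    for name, layout in (("original", ORIGINAL), ("revised", REVISED)):
        print(name, check_layout(layout) or "no warnings")
```

The HS2-with-metastore preference is left out of the check on purpose, since it is described above as less important.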

Thanks,
Darren

Thanks, that is helpful. How can I remove the Hive Metastore and HS2 roles from a host? I only see "Add role" and "Stop" in CM.

----- Updated ---

Never mind, I found it

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.5.1/Cloudera-Manager-Enterpr...

 


So the Hive Gateway should be on all hosts, including the NameNodes and DataNodes? Thanks

If you want /etc/hive/conf to be populated, then place Hive Gateways on those hosts. It's common to place them on all hosts so that "hive" works from the command line everywhere.
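
A quick way to confirm that a host actually received the client configuration is to look for a config file in that directory. The sketch below assumes the deployed client config includes a hive-site.xml file; the `has_hive_client_config` function name is made up for this example.

```python
from pathlib import Path


def has_hive_client_config(conf_dir: str = "/etc/hive/conf") -> bool:
    """Return True if conf_dir contains a hive-site.xml file.

    Assumption: a populated Hive client config directory contains hive-site.xml.
    """
    return (Path(conf_dir) / "hive-site.xml").is_file()


if __name__ == "__main__":
    status = "present" if has_hive_client_config() else "missing"
    print(f"Hive client config is {status} on this host")
```

Running this on each host after deploying client configuration shows which hosts still need a Hive Gateway.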