Collocating roles on 3 VMs?

Contributor

Hi, does anyone see a problem with collocating the following roles (assuming sufficient RAM on the VM)?

 

Server A

Hive Gateway

Hive Metastore

HiveServer2

Oozie

Sqoop 2

 

Server B

Hive Gateway

Hive Metastore

HiveServer2

Impala Catalog

Impala StateStore

(not sure if I need Hive running on both?)

 

Server C

Hue

All 5 Cloudera Management Services

 

I have a separate MySQL VM hosting the "amon", "hive", and "impala_catalog" databases. The above does not include the 5 DataNodes and 2 NameNodes, which are separate physical machines (also hosting the ZK, JournalNode, Impala Daemon, and YARN roles). Suggestions? Thank you.

 

Mike

1 ACCEPTED SOLUTION

Hi Mike,

You should only have 1 instance of the Hive Metastore.
Hive Gateway usually should be on all hosts (otherwise /etc/hive/conf won't be populated). Note that this is not a daemon role, it's just a marker for where client configuration should go (client config goes to a host with any role from a service).
You normally only need 1 instance of HiveServer2. Multiple can be helpful for load balancing, but I doubt that's relevant with such a small cluster.
It's good to co-locate the Impala Catalog and Statestore with the Hive Metastore. For similar reasons, it's good to have HS2 on the same host as the metastore, though less important.

Thanks,
Darren
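
The guidelines above can be sketched as a small layout check. This is only an illustration, not a Cloudera Manager feature: the `layout` dictionary, the role name strings, and the `check_layout` helper are all assumptions made up for this example, with the hosts and roles copied from the question.

```python
# Hypothetical sketch: hosts and roles are plain strings copied from the question.
# Nothing here talks to Cloudera Manager; it only encodes the advice in the answer.
HIVE_ROLES = {"Hive Gateway", "Hive Metastore", "HiveServer2"}
IMPALA_ROLES = ("Impala Catalog", "Impala StateStore")

layout = {
    "Server A": {"Hive Gateway", "Hive Metastore", "HiveServer2", "Oozie", "Sqoop 2"},
    "Server B": {"Hive Gateway", "Hive Metastore", "HiveServer2",
                 "Impala Catalog", "Impala StateStore"},
    "Server C": {"Hue", "Cloudera Management Services"},
}


def check_layout(layout):
    """Return a list of warning strings for a {host: set_of_roles} mapping."""
    warnings = []

    # Only one Hive Metastore instance is expected.
    metastore_hosts = sorted(h for h, r in layout.items() if "Hive Metastore" in r)
    if len(metastore_hosts) != 1:
        warnings.append(
            f"expected 1 Hive Metastore, found {len(metastore_hosts)}: {metastore_hosts}"
        )

    # One HiveServer2 is normally enough on a small cluster.
    hs2_hosts = sorted(h for h, r in layout.items() if "HiveServer2" in r)
    if len(hs2_hosts) > 1:
        warnings.append(f"more than 1 HiveServer2 (only useful for load balancing): {hs2_hosts}")

    for host in sorted(layout):
        roles = layout[host]
        # Client config goes to any host that has a role from the service.
        if not roles & HIVE_ROLES:
            warnings.append(f"{host}: no Hive role, so /etc/hive/conf will not be populated")
        # Impala Catalog and StateStore are best placed with the Hive Metastore.
        for role in IMPALA_ROLES:
            if role in roles and "Hive Metastore" not in roles:
                warnings.append(f"{host}: {role} is not co-located with the Hive Metastore")

    return warnings


for warning in check_layout(layout):
    print(warning)
```

Dropping the Hive Metastore and HiveServer2 from Server A and adding a Hive Gateway to Server C would clear every warning in this sketch.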

Contributor

Thanks, that is helpful. How can I remove the Hive Metastore and HS2 roles from a host? I only see "Add role" and "Stop" in CM.

 

----- Updated ---

Never mind, I found it

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/4.5.1/Cloudera-Manager-Enterpr...

 

Contributor

So the Hive Gateway should be on all hosts, including the NameNodes and DataNodes? Thanks

If you want /etc/hive/conf to be populated, then place Hive Gateways on those hosts. It's common to place it on all hosts so "hive" works from the command line on every host.
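
One quick way to confirm that a host received the client configuration is to check that the directory exists and is not empty. A minimal sketch, assuming the default path mentioned above; the `hive_client_config_present` helper is a made-up name for illustration, not part of any Cloudera tool.

```python
import os


def hive_client_config_present(conf_dir="/etc/hive/conf"):
    """Return True if conf_dir exists (symlinks are followed) and has at least one entry."""
    return os.path.isdir(conf_dir) and len(os.listdir(conf_dir)) > 0


if hive_client_config_present():
    print("Hive client configuration found")
else:
    print("No Hive client configuration; consider adding a Hive Gateway to this host")
```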