
Which node types should the services below be installed on?


While installing HDP 2.5.3, in the 'install services' section, I was asked to assign the services below to nodes in the cluster, but I am not sure which type of node (master, data, or edge) each one should be installed on.

1. Phoenix Query Server

2. Supervisor

3. Flume

4. Accumulo TServer

5. Livy Server

6. Spark Thrift Server

Could anyone clarify which type of node each service should be installed on, and on how many nodes each service needs to run?


Super Collaborator

The answer here depends heavily on what services you need, what hardware is available, and how frequently you will use them.

Flume agents are lightweight and mostly collect logs.

Livy is just a web API for Spark, but it does maintain SparkContexts and starts with 2GB heap by default.

Supervisor is a Storm process. (I don't know much about Storm)

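The Spark Thrift Server, Phoenix Query Server, and Accumulo TServer (the Accumulo tablet server) should ideally run on separate hosts so that their query processing does not compete for the same resources. Install more than one instance of each to provide failover.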

If you have a limited number of servers, use your best judgement to decide which piece of your architecture is most critical, then dedicate hardware explicitly to it. For the rest, as long as a host has enough spare CPU, memory, and disk to run additional processing with little overhead, you can combine services on it with minimal impact.
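
As a rough illustration of that "enough spare capacity" check, here is a minimal Python sketch. It is only a sketch under stated assumptions: the function name `fits_on_host`, the 25% headroom, and every memory figure except Livy's 2 GB default heap (mentioned above) are hypothetical placeholders, not measured or recommended values. Replace them with numbers from your own cluster.

```python
# Hypothetical per-service memory needs in GB. Only the Livy figure
# (2 GB default heap) comes from the answer above; the rest are placeholders.
SERVICE_MEMORY_GB = {
    "Livy Server": 2,
    "Flume": 1,
    "Spark Thrift Server": 4,
    "Phoenix Query Server": 2,
}


def fits_on_host(services, free_memory_gb, headroom_fraction=0.25):
    """Return True if the services fit in the host's free memory
    while leaving the given fraction of that memory unused."""
    required = sum(SERVICE_MEMORY_GB[name] for name in services)
    usable = free_memory_gb * (1 - headroom_fraction)
    return required <= usable


# Two light services on a host with 8 GB free: 3 GB needed, 6 GB usable.
print(fits_on_host(["Livy Server", "Flume"], free_memory_gb=8))

# All four services on the same host: 9 GB needed, 6 GB usable.
print(fits_on_host(list(SERVICE_MEMORY_GB), free_memory_gb=8))
```

The same comparison can be repeated for CPU and disk. If a combination does not fit, that is a sign the service deserves its own host.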