Created 12-16-2015 06:25 PM
Say I have 20+ nodes on which to deploy the "SLAVE" components for the services HDFS, HIVE, HBASE and OOZIE. What is the general guidance on SLAVE component placement? Should every node contain all SLAVE components, or is it better to isolate them? If the latter, is there any guidance on how to do this?
Created 12-16-2015 06:29 PM
@Vijay Srinivasaraghavan First of all, Oozie doesn't have any slave components. It has one master component, the Oozie Server, which should be placed on one of the master nodes, and clients, which should be placed on edge/client nodes. The same is true of Hive, which doesn't have any slave components either.
As a general recommendation, you can start with HBase RegionServer, HDFS DataNode and YARN NodeManager on all slave nodes. Over time, once you understand your workload, use cases and compute requirements, this layout will usually evolve.
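To make that starting point concrete, here is a rough sketch of the layout in Python. It is only an illustration under stated assumptions: the host names and the `build_layout` helper are hypothetical, the component identifiers are assumed to follow Ambari-style naming, and putting every master component on a single master host is a simplification rather than a recommendation.

```python
# Hypothetical sketch of an initial component layout; not an Ambari API.
# Component names are assumed Ambari-style identifiers.
MASTER_COMPONENTS = ["HIVE_METASTORE", "HIVE_SERVER", "OOZIE_SERVER"]
SLAVE_COMPONENTS = ["DATANODE", "HBASE_REGIONSERVER", "NODEMANAGER"]
CLIENT_COMPONENTS = ["HIVE_CLIENT", "OOZIE_CLIENT"]


def build_layout(master_host, slave_hosts, edge_hosts):
    """Return a dict mapping each host name to its list of components."""
    layout = {master_host: list(MASTER_COMPONENTS)}
    # Starting point: every slave node runs the same three slave components.
    for host in slave_hosts:
        layout[host] = list(SLAVE_COMPONENTS)
    # Oozie and Hive contribute only clients, which go on edge/client nodes.
    for host in edge_hosts:
        layout[host] = list(CLIENT_COMPONENTS)
    return layout


if __name__ == "__main__":
    slaves = ["slave%d" % i for i in range(1, 21)]  # illustrative host names
    for host, components in build_layout("master1", slaves, ["edge1"]).items():
        print(host, components)
```

Once the workload is better understood, the per-host lists can diverge, for example by dropping one slave component from a subset of hosts.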
Created 12-17-2015 03:51 PM
There are only a few SLAVE components in a stack (DN, NM, ACCUMULO_TSERVER, FLUME_HANDLER, METRICS_MONITOR, HBASE_REGIONSERVER, PQS, SUPERVISOR). I am trying to understand whether some of these components can be grouped separately to balance the load (e.g., CPU-centric components could be isolated). What is the general guidance or deployment practice used in the field?