Created 01-12-2021 01:57 AM
We want to deploy HDP 3.1.5 in a production environment.
We have 3 servers for master nodes and 6 servers for worker nodes.
We have already planned the component layout across these 9 nodes, but we want to make sure where to place the service clients below:
1. YARN clients
2. MapReduce2 clients
3. Hive clients
4. Infra Solr clients
5. Kerberos clients
6. Oozie clients
7. Pig clients
8. Spark2 clients
9. Sqoop clients
10. Tez clients
Created 01-12-2021 11:58 AM
In a Hadoop cluster, three types of nodes exist: master, worker, and edge nodes. The distinction of roles helps maintain efficiency.
Master nodes control which nodes perform which tasks and which processes run on which nodes. The majority of work is assigned to worker nodes. Worker nodes store most of the data and perform most of the calculations. Edge nodes, aka gateway nodes, facilitate communication from end users to the master and worker nodes.
The 3 master nodes should host the NameNode (active and standby), the YARN ResourceManager (active and standby), the ZooKeeper quorum (one server on each of the 3 masters), and any other master components you intend to install. On the 6 worker nodes (aka slave nodes), install the NodeManager, the DataNode, and all the clients.
There is no need to install the clients on the master nodes.
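As a rough illustration of the layout described above, here is a minimal Python sketch that writes the placement out as a host-to-component map. The host names (`master1`, `worker1`, and so on) are hypothetical, and putting the NameNode and ResourceManager pairs on the first two masters is only an assumption for the example; the answer only says that the active and standby instances live on the master nodes.

```python
# Hypothetical host names; only the role-to-component mapping comes from the answer.
MASTERS = ["master1", "master2", "master3"]
WORKERS = [f"worker{i}" for i in range(1, 7)]

# The ten client types listed in the question.
CLIENTS = [
    "YARN", "MapReduce2", "Hive", "Infra Solr", "Kerberos",
    "Oozie", "Pig", "Spark2", "Sqoop", "Tez",
]


def build_layout():
    """Return a dict mapping each host name to its list of components."""
    layout = {}
    for index, host in enumerate(MASTERS):
        # Every master carries one ZooKeeper server (3-node quorum).
        components = ["ZooKeeper Server"]
        # Assumption: the active/standby pairs sit on the first two masters.
        if index < 2:
            components += ["NameNode", "ResourceManager"]
        layout[host] = components
    for host in WORKERS:
        # Workers carry the DataNode, the NodeManager and all the clients.
        layout[host] = ["DataNode", "NodeManager"] + [f"{c} client" for c in CLIENTS]
    return layout


if __name__ == "__main__":
    for host, components in build_layout().items():
        print(f"{host}: {', '.join(components)}")
```

Writing the plan down like this before running the Ambari install wizard makes it easy to check that no client ends up on a master node and that every worker carries the same set of components.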
Some nodes run important tasks, and interrupting them may impact performance. Edge nodes allow end users to reach the worker nodes when necessary, providing a network interface to the cluster without leaving the entire cluster open to communication. That limitation improves reliability and security. As work is evenly distributed between the worker nodes, the edge node’s role helps avoid data skew and performance issues.
See my document on edge nodes: https://community.cloudera.com/t5/Support-Questions/Edge-node-or-utility-node-packages/td-p/202164#
Hope that helps
Created 01-12-2021 08:46 PM
If we want to limit how HDP/Hadoop developers, data analysts, or data scientists interact with the cluster, does that mean we don't need to install the clients on all worker nodes?
We have also found that, as a special case, the Sqoop and Oozie clients need to be installed on all nodes, including both master and worker nodes. Is that related to how Sqoop and Oozie work?