Support Questions

Chahat_0 · 01-02-2021

The Resource Manager manages resources by communicating with the Node Managers.

Are these Node Managers the same as Data Nodes?

Also, does the Resource Manager communicate with the Node Managers via the Name Node, since it has all the metadata?

And after resources have been allocated for a MapReduce job, the mappers and reducers are scheduled by the Name Node, right?

I got confused when the term Resource Manager came up after Name Node, hence I am looking for confirmation of the basics.

Shelton · 01-02-2021

@Chahat_0

Hadoop is designed to ensure that compute (Node Managers) runs as close to the data (Data Nodes) as possible. Containers for jobs are usually allocated on the same nodes where the data is present. Hence, in a typical Hadoop cluster, the Data Node and the Node Manager run on the same machine.
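To illustrate the locality idea, here is a minimal sketch in Python. It is a toy model, not YARN code: the function `pick_node`, the host names, and the container counts are all hypothetical, and real schedulers also consider rack locality and queue policies, which are left out here.

```python
from typing import Dict, List, Optional


def pick_node(block_hosts: List[str], free_containers: Dict[str, int]) -> Optional[str]:
    """Toy placement rule: prefer a node that stores a replica of the block."""
    # First choice: a node that holds the data and still has a free container.
    for host in block_hosts:
        if free_containers.get(host, 0) > 0:
            return host
    # Fallback: any node with a free container (the data is then read remotely).
    for host, free in free_containers.items():
        if free > 0:
            return host
    # Nothing is free anywhere in this toy cluster.
    return None


# Hypothetical cluster state: free container slots per worker.
free = {"worker1": 0, "worker2": 2, "worker3": 1}

# The block has replicas on worker1 and worker2; worker1 is full, so worker2 is chosen.
print(pick_node(["worker1", "worker2"], free))
```

Because the Data Node and the Node Manager share a machine, a container placed this way can read its input from the local disk instead of over the network.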

The Node Manager is the RM's slave process, while the Data Node is the NameNode's slave process. The NameNode is responsible for coordinating HDFS functions.

Resource Manager: the master daemon that manages resource allocation in the cluster. Node Manager: the slave daemon that runs on every worker node (alongside the Data Node) and is responsible for executing tasks on that node.

Node Managers manage the containers requested by jobs
Data Nodes manage the data
The NodeManager (NM) is YARN's per-node agent and takes care of the individual compute nodes in a Hadoop cluster. This includes keeping up to date with the ResourceManager (RM), overseeing container life-cycle management, monitoring the resource usage (memory, CPU) of individual containers, tracking node health, managing logs, and running auxiliary services that may be used by different YARN applications. The NodeManager communicates directly with the ResourceManager.

The ResourceManager and the NameNode are both master components [processes] that can run in a single or HA setup. They should run on separate, identical servers [nodes], usually with a higher spec than the data nodes. ZooKeeper is another important component, used to coordinate failover in an HA setup.

The ResourceManager and the NodeManager together form the data-computation framework.

The ResourceManager acts as the scheduler and allocates resources among all the applications in the system.
The NodeManager takes direction from the ResourceManager and runs on each node in the cluster. The resources available on a single node are managed by its NodeManager.
The ApplicationMaster, a framework-specific library, is responsible for running a specific YARN job, negotiating resources from the ResourceManager, and working with the NodeManagers to execute and monitor containers.
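The division of labour between the three roles can be sketched as a toy model in Python. Everything here is hypothetical and heavily simplified: the class and method names only mirror the YARN roles and are not the YARN API, and memory is the only resource tracked. Note that no NameNode appears in the scheduling path; it only serves HDFS metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NodeManager:
    """Per-node agent: launches containers and tracks local resources."""
    host: str
    free_mb: int
    running: List[str] = field(default_factory=list)

    def launch(self, task: str, mb: int) -> None:
        # Start the container and account for the memory it uses.
        self.free_mb -= mb
        self.running.append(task)


@dataclass
class ResourceManager:
    """Cluster-wide scheduler: decides where a container may run."""
    nodes: Dict[str, NodeManager] = field(default_factory=dict)

    def register(self, nm: NodeManager) -> None:
        self.nodes[nm.host] = nm

    def allocate(self, mb: int) -> str:
        # Grant a container on the first node with enough free memory.
        for nm in self.nodes.values():
            if nm.free_mb >= mb:
                return nm.host
        raise RuntimeError("no node has enough free memory")


class ApplicationMaster:
    """Per-job coordinator: asks the RM for containers, then uses the NMs."""

    def __init__(self, rm: ResourceManager) -> None:
        self.rm = rm

    def run(self, tasks: List[str], mb: int) -> List[Tuple[str, str]]:
        placements = []
        for task in tasks:
            host = self.rm.allocate(mb)           # negotiate with the RM
            self.rm.nodes[host].launch(task, mb)  # launch via that node's NM
            placements.append((task, host))
        return placements


rm = ResourceManager()
rm.register(NodeManager("worker1", free_mb=2048))
rm.register(NodeManager("worker2", free_mb=1024))

am = ApplicationMaster(rm)
placements = am.run(["map-0", "map-1", "reduce-0"], mb=1024)
print(placements)
```

In this sketch the two map tasks fill worker1 and the reduce task lands on worker2, which shows that it is the ApplicationMaster and the ResourceManager, not the NameNode, that decide where mappers and reducers run.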

Hope that helps

Cloudera Community


Relation between Resource Manager and Name Node?