What are worker and edge nodes?
Labels: Apache Hadoop
Created 06-08-2017 06:57 AM
What are worker nodes and edge nodes?
Why are we using these nodes?
What is the role of these nodes?
What role do they play when a job is executed?
Created 06-08-2017 07:02 AM
An edge node is a dedicated node (machine) where no Hadoop services are running and where you install only the so-called Hadoop clients (HDFS, Hive, HBase, etc.). In your case your BI tool will also play the role of a Hadoop client. A client means that only the respective component's client libraries and scripts are installed, together with its config files. If you change the config through Ambari, Ambari will automatically refresh the config files on the edge node as well. In a small test cluster without an edge node you can pick a node where Hadoop services are running (for example, a master node) to play the role of your edge node. (In a large cluster with many users there are usually multiple edge nodes.) As the "edge node folder" you can use any folder on the edge node you decide to use. Usually we run Sqoop, HDFS, Oozie, etc. commands from an edge node. The edge node is a client-facing machine that has all the client tools needed to operate on the cluster. It is not a good idea to use the NameNode or other HDP components as your edge node; typically you want a separate node designated just for client access.
Worker nodes make up the majority of the machines in the cluster and do the work of storing the data and running computations. A worker node usually runs both a DataNode and a NodeManager, along with similar per-node services.
https://community.hortonworks.com/questions/87884/which-node-to-use.html
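For illustration only (the config path and class name below are assumptions, not something from this thread): a program on the edge node needs nothing more than the Hadoop client jars and the cluster config files that Ambari keeps in sync there. A minimal Java sketch of such a client reading from HDFS:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal HDFS client meant to run on an edge node. It only needs the client
// libraries and the cluster config files (core-site.xml / hdfs-site.xml);
// fs.defaultFS in core-site.xml tells it where the NameNode is, so no Hadoop
// service has to run on this machine. The /etc/hadoop/conf path is an
// assumption for the example.
public class EdgeNodeHdfsClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));

        FileSystem fs = FileSystem.get(conf);

        // Roughly equivalent to running "hdfs dfs -ls /user" from the edge node.
        for (FileStatus status : fs.listStatus(new Path("/user"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}

The hdfs, Sqoop, and Oozie command-line clients work the same way: they read the config files installed on the edge node and talk to the services running on the master and worker nodes, so nothing Hadoop-related has to run locally.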
Created 06-08-2017 07:43 AM
Thanks @Jay SenSharma
Will the worker node also be available as a dedicated node, like the edge node? Also, when a job is executed, will all the intermediate staging data be stored on the worker node? How does the worker node access data from the DataNode? Forgive me if these are basic questions; I'm trying to understand worker nodes.
Created 04-10-2018 02:27 PM
@Bala Vignesh N V Your worker node is the same as your data node. Worker nodes are the ones that actually do the work in the cluster.
