Created on 03-14-2016 04:28 PM - edited 09-16-2022 03:09 AM
I'm trying to clean up some of the older nodes on our cluster.
Most of our new data nodes have only 4 components: DataNode, RegionServer, MetricsMonitor, and NodeManager.
I *think* I understand the purpose of each of these.
However, many of our older data nodes have as many as 15 components, including DataNode, RegionServer, MetricsMonitor, NodeManager PLUS:
HCat Client, HDFS Client, HiveClient, MapReduce2 Client, Oozie Client, Tez Client, Pig, Sqoop, YARN Client, Zookeeper Client.
Can someone please point me to some documentation for the purpose of each of these clients?
Will a data node still be leveraged for a Tez query if it doesn't have a Tez Client?
How is the YARN client different from a NodeManager?
What is a zookeeper client?
And most importantly, how can I safely remove the unnecessary elements?
Created 03-14-2016 06:09 PM
For a better understanding, you can compare those clients with a MySQL or Oracle client.
HDP client libraries are Java libraries that facilitate communication from a remote host. A host with a client installed has all the HDP JAR files it needs to communicate with Hive, HDFS, etc. Note that installing a client does not start any HDP service on that host.
Will a data node still be leveraged for a Tez query if it doesn't have a Tez Client?
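Yes. Tez tasks run inside YARN containers launched by the NodeManager, so the node still does work for Tez queries. The Tez client is only needed on hosts from which jobs are submitted.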
How is the YARN client different from a NodeManager?
NodeManager (NM) is YARN's per-node agent and takes care of the individual compute nodes in a Hadoop cluster. This includes keeping up to date with the ResourceManager (RM), overseeing containers' life-cycle management, monitoring the resource usage (memory, CPU) of individual containers, tracking node health, managing logs, and running auxiliary services that different YARN applications may use.
The YARN client is just a command-line tool.
What is a zookeeper client?
It's just the ZooKeeper command-line tool.
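To see exactly which components are extra on an older node, a quick set difference against the 4-component baseline is enough. This is only a minimal sketch: the component names are copied from the question, and `BASELINE`, `OLD_NODE`, and `extra_components` are hypothetical names made up for the illustration, not part of any HDP tool.

```python
# Components on a new-style data node (names taken from the question).
BASELINE = {"DataNode", "RegionServer", "MetricsMonitor", "NodeManager"}

# Components reported on one of the older nodes (names taken from the question).
OLD_NODE = BASELINE | {
    "HCat Client", "HDFS Client", "HiveClient", "MapReduce2 Client",
    "Oozie Client", "Tez Client", "Pig", "Sqoop", "YARN Client",
    "Zookeeper Client",
}


def extra_components(node, baseline):
    """Return the components present on `node` but not in `baseline`, sorted."""
    return sorted(set(node) - set(baseline))


if __name__ == "__main__":
    # Print each candidate for removal on its own line.
    for name in extra_components(OLD_NODE, BASELINE):
        print(name)
```

Everything this prints is a client or command-line tool rather than a running service, so it is a candidate for removal on a pure worker node, provided nobody submits jobs from that host.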
Created 03-14-2016 07:02 PM
Makes sense.
Thanks!
Created 10-06-2016 08:47 PM
How do I remove clients from the worker nodes?
Thanks