I have often heard the term "edge node" in Hadoop architecture discussions.
However, when I created a cluster using Cloudera Manager, it created a fully functional cluster for me without designating any node as an edge node.
So my question is: if I want to follow the best practice of having an edge node and I am using Cloudera Manager to create the cluster, how do I create an edge node?
A gateway or edge node has the necessary libraries and client components present, as well as the current configuration of the cluster. Generally you do not mix them with actual cluster service nodes.
So your edge or gateway node would be a system trusted by the cluster, on which end users run CLI-based tools like beeline, the hdfs tools, etc.
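One quick way to see what "current configuration of the cluster" means on a gateway host is to read the deployed client config and check which NameNode it points at. The sketch below is only an illustration: it parses a Hadoop-style `*-site.xml` document and prints `fs.defaultFS`. The sample XML and the hostname `nn.example.com` are made up, and on a real gateway you would read the file from wherever your client configs are deployed (often `/etc/hadoop/conf/core-site.xml`, but that path is an assumption about your setup).

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return a dict of name -> value from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample; a real gateway would have this file deployed by
# Cloudera Manager's client configuration step.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://nn.example.com:8020</value>
  </property>
</configuration>
"""

props = read_hadoop_properties(SAMPLE_CORE_SITE)
print(props.get("fs.defaultFS"))
```

If the value is missing or points at the wrong NameNode, the client configuration on that host is probably stale or was never deployed.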
You install and configure agents on the gateway hosts in question (or add them via the hosts interface in the cluster to automate the install and config), and then from the "instances" UI for Hive you can "add role instances" and add the gateway hosts.
You can add gateways for HDFS as well (to the same new hosts). Check the instance role options for the services you are running on the cluster. Gateway instance deployment is also discussed here;
And here (briefly) for Hive services.
From that point forward, you want to use "deploy client configurations" to push the current cluster config information, after saving changes and restarting the relevant services that have gateway instances used by clients.
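The answer above describes the UI route. If you would rather script it, Cloudera Manager also exposes a REST API, and the two steps (add a gateway role, then deploy client configuration) could be sketched roughly as below. Treat this as an assumption-heavy sketch, not the documented procedure: the endpoint paths and JSON shape are from my recollection of the CM API and should be checked against the API docs for your version, and the host `cm.example.com`, the API version `v19`, and the cluster, service, and host IDs are all hypothetical. The code only builds the requests; it does not send anything.

```python
import json
from urllib.parse import quote

# Hypothetical Cloudera Manager host and API version; adjust for your install.
API_BASE = "http://cm.example.com:7180/api/v19"


def gateway_role_request(cluster, service, host_id):
    """Build (method, url, body) to add a GATEWAY role for a service on a host.

    Assumed endpoint: POST /clusters/{cluster}/services/{service}/roles
    """
    url = f"{API_BASE}/clusters/{quote(cluster)}/services/{quote(service)}/roles"
    body = {"items": [{"type": "GATEWAY", "hostRef": {"hostId": host_id}}]}
    return "POST", url, json.dumps(body)


def deploy_client_config_request(cluster):
    """Build (method, url, body) to deploy client configs cluster-wide.

    Assumed endpoint: POST /clusters/{cluster}/commands/deployClientConfig
    """
    url = f"{API_BASE}/clusters/{quote(cluster)}/commands/deployClientConfig"
    return "POST", url, None


# Hypothetical names: cluster "Cluster 1", service "hive", host ID "edge-host-1".
role_req = gateway_role_request("Cluster 1", "hive", "edge-host-1")
deploy_req = deploy_client_config_request("Cluster 1")
print(role_req)
print(deploy_req)
```

You would send these with whatever HTTP client you prefer, using your CM admin credentials, and repeat the role request for each service (HDFS, Hive, and so on) that should have a gateway on the edge host.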
I have the same question. Basically, I would like to know whether the only way to add a client (gateway node) is after the cluster has been installed.
Can we install all nodes plus the edge nodes when installing Hadoop, with no cluster services (no NN or DN) on these edge nodes, but with Sqoop or Hue on them?