Created on 10-11-2016 01:46 PM - edited 09-16-2022 03:44 AM
I read http://stackoverflow.com/questions/8456141/in-a-hadoop-cluster-should-hive-be-installed-on-all-nodes and it says Hive should be installed on client machines. Alright. But in Ambari I install the HCatalog and Hive services, right? Where should these services live? What does it mean to install Hive on a client machine if the service is also installed on the cluster? Or do I not need to install the Hive service in Ambari at all? From what I understand, to add a client machine, only the /etc/hosts file is edited to declare it as a host, and that's it. Or do I add it as a host in Ambari and also move the Hive service to the client machine?
Also, in that case, should Pig be on the client too?
What about Oozie and Ranger? Where should these two be installed?
Created 10-11-2016 02:48 PM
That Stack Overflow link is not a good reference (it is oversimplified and incorrect).
If you install through the Ambari management console, services are assigned to masters and slaves automatically. See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-installation/content/ch_Deploy_an...
If you are interested in what a large cluster with many services looks like, or if you want to do it manually (not preferable), refer to: https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....
Created 10-11-2016 03:03 PM
It asks for Oozie and Hive to be installed on master nodes. Why is that? Should they not be installed on edge/client nodes?
Created 10-11-2016 03:15 PM
Hi @Simran Kaur. Edge/client nodes are only for user access to the cluster. They are not mandatory for a Hadoop cluster, since users can reach it through other means (e.g. Ambari views, Zeppelin, WebHDFS, HDFS mounts and others). So focusing on edge/client nodes is a bit of a distraction.
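To illustrate access without an edge node, here is a minimal sketch of how a WebHDFS request URL is put together, so any machine that can reach the NameNode over HTTP can talk to HDFS. The host name and path are hypothetical, and port 50070 is assumed as the NameNode HTTP port (the Hadoop 2.x default); check your own cluster's settings. The helper only builds the URL and does not contact a cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(namenode_host, hdfs_path, op, port=50070, **params):
    """Build a WebHDFS REST URL of the form
    http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>.

    namenode_host and hdfs_path are supplied by the caller; the default
    port is an assumption (Hadoop 2.x NameNode HTTP default).
    """
    query = urlencode({"op": op, **params})
    return f"http://{namenode_host}:{port}/webhdfs/v1{quote(hdfs_path)}?{query}"


if __name__ == "__main__":
    # Hypothetical host and directory, for illustration only.
    print(webhdfs_url("namenode.example.com", "/user/simran", "LISTSTATUS"))
```

An HTTP GET on the printed URL (from a browser, curl, or any HTTP client) would list the directory, provided the cluster has WebHDFS enabled and the user is allowed to read it.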
The main architecture of Hadoop is the master-slave architecture of services. At the highest level, a service typically has a master that manages a job and slaves that do the work, distributed across the cluster. These are never on an edge node (an edge node only lets the user communicate with the master service).
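The split described above can be sketched as a small lookup table. This is an illustrative layout only, not what Ambari will necessarily assign on your cluster: the role names and the grouping of components are assumptions chosen to show the idea that server components sit on masters, worker components on slaves, and only client tools on an edge node.

```python
# Illustrative (assumed) placement of components by node role.
LAYOUT = {
    "master": ["NameNode", "ResourceManager", "HiveServer2",
               "Hive Metastore", "Oozie Server"],
    "slave": ["DataNode", "NodeManager"],
    "edge": ["Hive client", "Pig client", "Oozie client"],
}


def role_of(component):
    """Return the node role that hosts the given component in LAYOUT."""
    for role, components in LAYOUT.items():
        if component in components:
            return role
    raise KeyError(component)


def edge_runs_only_clients():
    """True if every component listed for the edge role is a client tool."""
    return all(name.endswith("client") for name in LAYOUT["edge"])


if __name__ == "__main__":
    for name in ("HiveServer2", "Oozie Server", "Pig client"):
        print(name, "->", role_of(name))
```

In this sketch the Hive and Oozie servers land on a master, while the edge node carries only the client tools that submit work to them.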