Where to install Hive, Pig, Oozie and Ranger on a multi-node Hadoop cluster

Expert Contributor

I read http://stackoverflow.com/questions/8456141/in-a-hadoop-cluster-should-hive-be-installed-on-all-nodes, which says Hive should be installed on client machines. Alright. But in Ambari I install the HCatalog and Hive services, right? Where should these services live? What does it mean to install Hive on a client machine if the service is installed on the cluster too? Or do I not need to install the Hive service in Ambari at all? From what I understand, adding a client machine only requires editing the /etc/hosts file to declare it as a host, and that's it. Or do I add it as a host in Ambari and also move the Hive service to the client machine?

Also, in that case, should Pig be on the client too?

What about Oozie and Ranger? Where should these two be installed?

1 ACCEPTED SOLUTION

Guru

That Stack Overflow link is not a good reference (it is oversimplified and incorrect).

If you install through the Ambari management console, services are assigned to masters and slaves automatically. See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-installation/content/ch_Deploy_an...

If you are interested in what a large cluster with many services looks like, or if you want to do it manually (not preferable), refer to: https://community.hortonworks.com/articles/16763/cheat-sheet-and-tips-for-a-custom-install-of-horto....


3 REPLIES 3

Expert Contributor

It asks for Oozie and Hive to be installed on master nodes. Why is that? Shouldn't they be installed on edge/client nodes?

Guru

Hi @Simran Kaur. Edge/client nodes exist only to give users access to the cluster. They are not mandatory for a Hadoop cluster, since users can reach it by other means (e.g. Ambari Views, Zeppelin, WebHDFS, HDFS mounts and others). So focusing on edge/client nodes is a bit of a distraction.

The core architecture of Hadoop is the master-slave architecture of its services. At the highest level, a service typically has a master that manages a job and slaves that do the work, distributed across the cluster. These never run on an edge node (an edge node only lets the user communicate with the master service).
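
To make the master/slave/client split more concrete, here is a rough sketch in Python. It is only an illustration: the host names are made up, the layout is one hypothetical arrangement rather than what Ambari will necessarily pick for you, and the component names are used purely as labels. It models each service as a server part (which goes on a master node), worker parts, and a client part, and checks that an edge node carries nothing but clients.

```python
# Illustrative sketch only: host names and this layout are hypothetical.
MASTER, WORKER, CLIENT = "master", "worker", "client"

# Role of each component. The names are labels chosen for this example.
COMPONENT_ROLE = {
    "NAMENODE": MASTER,
    "RESOURCEMANAGER": MASTER,
    "HIVE_SERVER": MASTER,
    "HIVE_METASTORE": MASTER,
    "OOZIE_SERVER": MASTER,
    "RANGER_ADMIN": MASTER,
    "DATANODE": WORKER,
    "NODEMANAGER": WORKER,
    "HIVE_CLIENT": CLIENT,
    "PIG": CLIENT,
    "OOZIE_CLIENT": CLIENT,
}

# One possible placement of components on hosts (assumed, not prescribed).
LAYOUT = {
    "master1.example.com": [
        "NAMENODE", "RESOURCEMANAGER", "HIVE_SERVER",
        "HIVE_METASTORE", "OOZIE_SERVER", "RANGER_ADMIN",
    ],
    "worker1.example.com": ["DATANODE", "NODEMANAGER"],
    "worker2.example.com": ["DATANODE", "NODEMANAGER"],
    "edge1.example.com": ["HIVE_CLIENT", "PIG", "OOZIE_CLIENT"],
}


def misplaced_on_edge(layout, edge_hosts):
    """Return (host, component) pairs where a non-client component sits on an edge node."""
    return [
        (host, component)
        for host in edge_hosts
        for component in layout.get(host, [])
        if COMPONENT_ROLE[component] != CLIENT
    ]


if __name__ == "__main__":
    problems = misplaced_on_edge(LAYOUT, ["edge1.example.com"])
    print("misplaced components:", problems)
```

In this picture, "installing Hive on a client machine" means putting only the client part on the edge node, while the server parts of Hive and Oozie stay on master nodes. Pig has no server part here, so it appears only as a client.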