Created 02-24-2017 10:34 AM
I am planning to set up a 4-node production cluster on Azure VMs, with 1 edge node, 1 master node and 2 slave nodes. I want to run the following services on that cluster:
1) Namenode
2) Oozie
3) DataNode
4) YARN
5) Spark
6) Ranger
7) Atlas
8) Knox
9) HBase
10) SAP HANA Vora
11) ZooKeeper
I am looking for guidelines on the memory, cores and storage required by each of the Hadoop services listed above. I need to buy 4 VMs on Azure, and from an infrastructure perspective I want to understand how much memory, how many cores and how much storage would be optimal for these services (service by service), keeping in mind that more services may be added in the future.
Is there any reference documentation/link?
Any help would be appreciated.
Thanks
Created 02-24-2017 12:17 PM
See this article on best practices for deploying HDP on Azure: https://community.hortonworks.com/articles/22376/recommendations-for-microsoft-azure-hdp-deployment-...
For most production clusters, we typically recommend enabling HA for services. HA requires at least 2 master servers, although 3 would be better. You also need 3 ZooKeeper instances, so that the ensemble still has a majority if one instance fails. While you can put ZooKeeper on the data nodes, it would be better to put ZooKeeper on the master nodes.
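To illustrate why 3 ZooKeeper instances is the usual minimum, here is a small sketch of the majority-quorum arithmetic. The function names `quorum_size` and `tolerated_failures` are made up for this example and are not part of any ZooKeeper or Ambari API; the only assumption is that the ensemble stays available as long as a strict majority of its instances is up.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with `ensemble_size` members."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many members can fail while a strict majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare small ensemble sizes side by side.
    for n in range(1, 6):
        print(
            f"{n} instance(s): quorum={quorum_size(n)}, "
            f"tolerated failures={tolerated_failures(n)}"
        )
```

Under that assumption, 2 instances tolerate no failures at all, 3 instances tolerate 1, and a 4th instance adds nothing over 3, which is why odd ensemble sizes are normally chosen.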
Created 02-24-2017 10:46 AM
Thoughts?
Created 02-24-2017 01:12 PM
Thanks! The article is very useful.