- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Best Practices for Deploying Production Hadoop Cluster
- Labels:
-
Apache Ambari
Created ‎02-24-2017 10:34 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am planning to setup a 4 node Production Cluster on Azure VM's. I am planning to have 1 edge Node, 1 Master Node and 2 Slave Nodes. I wanted to setup below mentioned services on that cluster.
1) Namenode
2) Oozie
3) DataNode
4) Yarn
5) Spark
6) Ranger
7) Atlas
😎 Knox
9) Hbase
10) SAP Hana Vora
11) Zookeeper
I am actually looking out for any guidelines on Memory, Cores and Storage to be required for different services of hadoop as mentioned above. I need to buy 4 VM's on Azure but i want to understand from Infrastructure perspective that how much memory, cores, Storage would be optimal for above mentioned hadoop services(service wise) ,keeping in mind more services can also be added in future.
Is there any reference documentation/link?
Any help would be appreciated.
Thanks
Created ‎02-24-2017 12:17 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
See this article on best practices for deploying HDP on Azure: https://community.hortonworks.com/articles/22376/recommendations-for-microsoft-azure-hdp-deployment-...
For most production clusters, we typically recommend enabling HA for services. That requires that you have at a minimum 2 master servers, although 3 would be better. You need 3 Zookeeper instances. While you can put Zookeeper on the data nodes, it would be better to put Zookeeper on the master nodes.
Created ‎02-24-2017 10:46 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thoughts?
Created ‎02-24-2017 12:17 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
See this article on best practices for deploying HDP on Azure: https://community.hortonworks.com/articles/22376/recommendations-for-microsoft-azure-hdp-deployment-...
For most production clusters, we typically recommend enabling HA for services. That requires that you have at a minimum 2 master servers, although 3 would be better. You need 3 Zookeeper instances. While you can put Zookeeper on the data nodes, it would be better to put Zookeeper on the master nodes.
Created ‎02-24-2017 01:12 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks!! article is very useful.
