4 node cluster configuration
Labels: Apache Ambari
Created ‎04-27-2017 04:47 PM
I'm setting up a 4-node cluster (1 physical machine and 3 virtual machines). The physical machine is to be the master/edge node and the 3 VMs are to be the DataNodes. My question is about the "Assign Masters" step in Ambari 2.5.0.3: do I keep everything on the intended master (including the Secondary NameNode) and run only one ZooKeeper Server/Metrics Collector/Activity Analyzer/Activity Explorer, or do I place the Secondary NameNode on one of the DataNodes, along with a ZooKeeper Server/Metrics Collector/Activity Analyzer/Activity Explorer on each of the DataNodes?
My intent is to have the physical machine act as the client/edge node and the VMs just handle data. Any advice is appreciated.
Thanks in advance.
Created ‎04-27-2017 08:05 PM
Hi @Joshua Petree, what is the purpose of this cluster?
For any cluster beyond a dev sandbox, you need 3 to 5 masters. For ZooKeeper to stay available when a server fails, you need at least three ZK instances, because the ensemble only works while a majority of its servers are up. It's not recommended to run a Secondary NameNode or any other master services, such as ZK, on a DataNode. Also, for HDFS to be HA, you need to run a Standby NameNode, which is a different thing from the Secondary NameNode.
Remember that Hadoop is designed with the assumption that DataNodes will fail. If you start putting critical services on DataNodes, not only will it hurt your performance, it will create points of failure that will affect the overall health of the cluster.
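To make the "at least three ZK instances" point concrete, here is a minimal sketch of the majority arithmetic behind it. The function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of ZooKeeper or Ambari; the only assumption is the standard majority-quorum rule.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare small ensemble sizes side by side.
    for n in (1, 2, 3, 4, 5):
        print(f"{n} server(s): quorum={quorum_size(n)}, "
              f"can lose={tolerated_failures(n)}")
```

With a single ZK server (or even two) the ensemble cannot survive any failure, three servers survive one, and a fourth adds nothing over three, which is why odd sizes are the usual choice.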
Created ‎04-27-2017 08:19 PM
This is what I was expecting, but sadly this is what I have been given to work with. Thank you for your input. I am hoping that if this build goes well, I can convince the "powers that be" to approve a bigger budget for a proper cluster. Thank you again.
