<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103621#M33350</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/186/sshaw.html" nodeid="186"&gt;@Scott Shaw&lt;/A&gt; Thanks for the prompt answer!&lt;/P&gt;&lt;P&gt;Among the frameworks required for this cluster (Hadoop, Hive, Pig, Oozie, HBase, ZooKeeper, Spark, Storm, Sqoop and Kafka), is there any classification by how IO-, compute- and memory-intensive they are? I might be wrong, but are there frameworks in the list that are both IO-intensive and memory-intensive?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;</description>
    <pubDate>Thu, 30 Jun 2016 12:26:37 GMT</pubDate>
    <dc:creator>rahulkumar_mish</dc:creator>
    <dc:date>2016-06-30T12:26:37Z</dc:date>
    <item>
      <title>Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103618#M33347</link>
      <description>&lt;P&gt;Dear folks,&lt;/P&gt;&lt;P&gt;I am setting up HDP 2.4 on a small cluster for PoC work, but I am unsure which heuristics to use for the &lt;STRONG&gt;assigning masters&lt;/STRONG&gt; and &lt;STRONG&gt;assigning slaves and clients&lt;/STRONG&gt; steps of the install wizard. I started with the documentation provided here: &lt;A href="https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Ready.html"&gt;https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Installing_HDP_AMB/content/ch_Getting_Ready.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Description of cluster:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;A small cluster of 8 machines, each with 8 GB RAM, 6-8 cores and 500 GB of storage. I am using one machine for Ambari, and the remaining 7 machines are for the NameNode, Secondary NameNode and DataNodes. All nodes run CentOS 6. Availability and reliability are not a concern, as this is a PoC cluster on which some algorithms will be tested for functionality.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Frameworks required on cluster&lt;/STRONG&gt;: Hadoop, Hive, Pig, Oozie, HBase, ZooKeeper, Spark, Storm, Sqoop and Kafka&lt;/P&gt;&lt;P&gt;To get my feet wet, I chose Ambari and HDP 2.4.0, and deploying a cluster has been a positive experience so far, thanks to the good documentation and my decent knowledge of Linux.&lt;/P&gt;&lt;P&gt;Going forward, I would like to know what heuristics and logic experts use for assigning masters and slaves.
Most of the resources I have found in this community and elsewhere discuss heuristics based on system configuration (RAM, memory and cores). They are well reasoned for a heterogeneous cluster, and the takeaways are important heuristics that can make clusters efficient.&lt;/P&gt;&lt;P&gt;But given a homogeneous cluster, I am unsure how to proceed.&lt;/P&gt;&lt;P&gt;Any concrete or abstract ideas are much appreciated.&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jun 2016 18:22:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103618#M33347</guid>
      <dc:creator>rahulkumar_mish</dc:creator>
      <dc:date>2016-06-29T18:22:26Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103619#M33348</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11063/rahulkumarmishra.html" nodeid="11063"&gt;@Rahul Mishra&lt;/A&gt; You may want to start with this documentation &lt;A href="http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_cluster-planning-guide/content/ch_hardware-recommendations_chapter.html" target="_blank"&gt;http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_cluster-planning-guide/content/ch_hardware-recommendations_chapter.html&lt;/A&gt;. For small clusters like yours, where HA isn't a concern, you are basically dealing with only 2 types of nodes: master and worker nodes. I certainly wouldn't over-architect it. For an 8-node cluster you would have your Ambari Server, which can also hold your client services, 2 master nodes, and finally 5 worker nodes. &lt;/P&gt;&lt;P&gt;In a homogeneous cluster like yours, where each node has limited resources, your primary concern is to avoid co-locating services that compete for the same type of resource. For example, it would be OK to have an in-memory service like Spark co-exist with a more IO-intensive service, but not to have 2 memory-intensive services on the same node. &lt;/P&gt;&lt;P&gt;In your case you'll just have to build it out, monitor it, and be aware that running certain operations together may cause performance issues. The good thing about HDP is its ability to scale, so you are never really quite "locked-in" to a particular architecture.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Jun 2016 23:33:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103619#M33348</guid>
      <dc:creator>SQLShaw</dc:creator>
      <dc:date>2016-06-29T23:33:47Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103620#M33349</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11063/rahulkumarmishra.html" nodeid="11063"&gt;@Rahul Mishra&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The following is an example based on the above comment:&lt;/P&gt;&lt;P&gt;Node1: Ambari Server, Primary NameNode, ZooKeeper, HBase Master, Clients&lt;/P&gt;&lt;P&gt;Node2: Secondary/Standby NameNode, Hive services, Pig, Oozie, ZooKeeper&lt;/P&gt;&lt;P&gt;Node3: YARN, Spark, Sqoop, Kafka, Ambari Metrics Collector/Grafana, ZooKeeper&lt;/P&gt;&lt;P&gt;Node4-8: DataNode, NodeManager, HBase RegionServers, Clients&lt;/P&gt;</description>
      <pubDate>Thu, 30 Jun 2016 01:43:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103620#M33349</guid>
      <dc:creator>vpoornalingam</dc:creator>
      <dc:date>2016-06-30T01:43:20Z</dc:date>
    </item>
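The node layout proposed in the reply above can be written down as plain data and sanity-checked before clicking through the Ambari wizard. What follows is only a minimal sketch under stated assumptions: it is not an Ambari or HDP API, the names `MASTERS`, `WORKER_SERVICES`, `LAYOUT` and `nodes_hosting` are hypothetical, and a few service labels are shortened for illustration (for example "Hive services" becomes "Hive").

```python
# Hypothetical, hand-written copy of the example layout; not an Ambari API.
MASTERS = {
    "Node1": ["Ambari Server", "NameNode", "ZooKeeper", "HBase Master", "Clients"],
    "Node2": ["Secondary NameNode", "Hive", "Pig", "Oozie", "ZooKeeper"],
    "Node3": ["YARN", "Spark", "Sqoop", "Kafka", "Ambari Metrics Collector", "ZooKeeper"],
}
WORKER_SERVICES = ["DataNode", "NodeManager", "HBase RegionServer", "Clients"]

# Node4 through Node8 all carry the same worker services.
LAYOUT = dict(MASTERS)
for i in range(4, 9):
    LAYOUT[f"Node{i}"] = list(WORKER_SERVICES)


def nodes_hosting(layout, service):
    """Return the sorted names of the nodes that run the given service."""
    return sorted(node for node, services in layout.items() if service in services)


if __name__ == "__main__":
    print(len(LAYOUT), "nodes in the layout")
    print("ZooKeeper on:", nodes_hosting(LAYOUT, "ZooKeeper"))
    print("DataNode on:", nodes_hosting(LAYOUT, "DataNode"))
    print("Storm on:", nodes_hosting(LAYOUT, "Storm"))
```

A lookup like this makes gaps easy to spot: Storm is in the list of required frameworks in the question, but the example layout does not place it on any node.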
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103621#M33350</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/186/sshaw.html" nodeid="186"&gt;@Scott Shaw&lt;/A&gt; Thanks for the prompt answer!&lt;/P&gt;&lt;P&gt;Among the frameworks required for this cluster (Hadoop, Hive, Pig, Oozie, HBase, ZooKeeper, Spark, Storm, Sqoop and Kafka), is there any classification by how IO-, compute- and memory-intensive they are? I might be wrong, but are there frameworks in the list that are both IO-intensive and memory-intensive?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Rahul&lt;/P&gt;</description>
      <pubDate>Thu, 30 Jun 2016 12:26:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103621#M33350</guid>
      <dc:creator>rahulkumar_mish</dc:creator>
      <dc:date>2016-06-30T12:26:37Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103622#M33351</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/350/vpoornalingam.html" nodeid="350"&gt;@vpoornalingam&lt;/A&gt; Many thanks for the prompt answer!&lt;/P&gt;</description>
      <pubDate>Thu, 30 Jun 2016 12:27:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103622#M33351</guid>
      <dc:creator>rahulkumar_mish</dc:creator>
      <dc:date>2016-06-30T12:27:35Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103623#M33352</link>
      <description>&lt;P&gt;You are welcome! Please accept &lt;A rel="user" href="https://community.cloudera.com/users/186/sshaw.html" nodeid="186"&gt;@Scott Shaw&lt;/A&gt;'s answer!&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jul 2016 02:54:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103623#M33352</guid>
      <dc:creator>vpoornalingam</dc:creator>
      <dc:date>2016-07-01T02:54:35Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103624#M33353</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/350/vpoornalingam.html" nodeid="350"&gt;@vpoornalingam&lt;/A&gt; One more question: where should I put the History Server, App Timeline Server and ResourceManager?&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jul 2016 17:32:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103624#M33353</guid>
      <dc:creator>rahulkumar_mish</dc:creator>
      <dc:date>2016-07-01T17:32:58Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103625#M33354</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/11063/rahulkumarmishra.html"&gt;@Rahul Mishra &lt;/A&gt;&lt;/P&gt;&lt;P&gt;The History Server, App Timeline Server and RM are YARN &amp;amp; MR master components. In @Venkat's layout, they would go on Node3. I also suggest placing Kafka on a different server, as it is an ingestion component.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jul 2016 12:04:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103625#M33354</guid>
      <dc:creator>Jagatheeshr</dc:creator>
      <dc:date>2016-07-06T12:04:41Z</dc:date>
    </item>
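The two adjustments suggested in the reply above (YARN and MapReduce masters on Node3, Kafka moved elsewhere) can be sketched as small edits to a layout dictionary. This is an illustrative sketch only: the `place` and `move` helpers are hypothetical, the starting lists are abbreviated from the example layout earlier in the thread, and the target node for Kafka is an arbitrary assumption because the reply does not name one.

```python
# Hypothetical starting point: Node2 and Node3 from the example layout in this thread.
layout = {
    "Node2": ["Secondary NameNode", "Hive", "Pig", "Oozie", "ZooKeeper"],
    "Node3": ["YARN", "Spark", "Sqoop", "Kafka", "Ambari Metrics Collector", "ZooKeeper"],
}

YARN_MR_MASTERS = ["ResourceManager", "History Server", "App Timeline Server"]
KAFKA_TARGET = "Node2"  # assumption: the reply only says "a different server"


def place(layout, node, services):
    """Add services to a node, skipping any that are already listed there."""
    current = layout.setdefault(node, [])
    for service in services:
        if service not in current:
            current.append(service)


def move(layout, service, src, dst):
    """Remove a service from src and add it to dst."""
    layout[src].remove(service)
    place(layout, dst, [service])


# Apply the two suggestions from the reply.
place(layout, "Node3", YARN_MR_MASTERS)
move(layout, "Kafka", "Node3", KAFKA_TARGET)

if __name__ == "__main__":
    for node in sorted(layout):
        print(node, layout[node])
```

On nodes with only 8 GB of RAM, which node should receive Kafka is worth deciding by monitoring, as recommended earlier in the thread, rather than by this sketch.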
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103626#M33355</link>
      <description>&lt;P&gt;Hi,
This discussion is valuable, as the HDP documentation has almost no information on small-cluster planning or on how to place HDP components across servers.&lt;/P&gt;&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/189/jramakrishnan.html"&gt;Jagatheesh Ramakrishnan&lt;/A&gt;&lt;/P&gt;&lt;P&gt;How would you change the configuration and component placement if we have 3 master nodes and (initially) 3 data nodes? The number of data nodes will increase as needed.
For each master node we are planning 32 GB RAM and a 250 GB HDD (more memory and disk can be added if needed); for each data node, 24 GB RAM and an 8 TB HDD.
Redundancy/HA of all components &lt;STRONG&gt;is a must&lt;/STRONG&gt;, as the configuration will be used in a production environment. The failure, restart or unavailability of a single node (even a master) must not disrupt any functionality.
As a redundant database solution for all components there will be 3-node MySQL active cluster (probably Percona XTRADB Cluster) used for all components requiring DB access. It will be located on master nodes.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Apr 2017 18:30:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103626#M33355</guid>
      <dc:creator>michal_tomaszew</dc:creator>
      <dc:date>2017-04-18T18:30:19Z</dc:date>
    </item>
    <item>
      <title>Re: Any guidelines on assigning masters and assigning slaves and clients for a homogeneous PoC cluster?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103627#M33356</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/350/vpoornalingam.html" nodeid="350"&gt;@vpoornalingam&lt;/A&gt;&lt;/P&gt;&lt;P&gt;May I ask why you want to install clients on Node1? It is a master node, right? Installing them on Node4-8 makes sense, since clients can access the services from there. My question is why they should go on a master node specifically.&lt;/P&gt;&lt;P&gt;Could you explain the reason, if possible?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sat, 09 Dec 2017 06:01:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Any-guidelines-on-assigning-masters-and-assigning-slaves-and/m-p/103627#M33356</guid>
      <dc:creator>shashankkumar_m</dc:creator>
      <dc:date>2017-12-09T06:01:32Z</dc:date>
    </item>
  </channel>
</rss>

