<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Which way of HDP cluster setup is best, having physical nodes or having multiple VMs with few physical nodes in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139984#M32008</link>
    <description>&lt;P&gt;There are pros and cons to both. VMs have a negative impact on performance, so we would normally go for bare metal. MapReduce scales well to lots of disks/processes, even on a single data node.&lt;/P&gt;&lt;P&gt;However, there are limits on VERY big nodes (there are new Apollo servers with 24 drives): you will want to increase the HDFS DataNode memory, and you may have issues with very big block reports being sent around. In that case, logically splitting a node into multiple smaller VMs might solve these issues.&lt;/P&gt;&lt;P&gt;But normally I would say go bare metal.&lt;/P&gt;</description>
    <pubDate>Wed, 15 Jun 2016 20:23:29 GMT</pubDate>
    <dc:creator>bleonhardi</dc:creator>
    <dc:date>2016-06-15T20:23:29Z</dc:date>
    <item>
      <title>Which way of HDP cluster setup is best, having physical nodes or having multiple VMs with few physical nodes</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139982#M32006</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;We have six physical machines. Which cluster setup is better: using the physical machines as they are, or creating multiple VMs on top of them to form a bigger cluster?&lt;/P&gt;&lt;P&gt;Those are highly available machines with more than 450 GB of RAM.&lt;/P&gt;&lt;P&gt;Please advise!&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jun 2016 16:10:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139982#M32006</guid>
      <dc:creator>uday_vakalapudi</dc:creator>
      <dc:date>2016-06-15T16:10:40Z</dc:date>
    </item>
    <item>
      <title>Re: Which way of HDP cluster setup is best, having physical nodes or having multiple VMs with few physical nodes</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139983#M32007</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/9657/udayvakalapudi.html" nodeid="9657"&gt;@Uday Vakalapudi&lt;/A&gt; You will typically be better off with multiple machines (scale out) rather than a smaller number of large machines (scale up).&lt;/P&gt;&lt;P&gt;If you consider the way that Hadoop works, jobs are effectively distributed across the whole cluster and all the resources can be utilised simultaneously. This is the opposite of what virtualisation typically handles, which is multiple machines with different workloads and different workload profiles (I/O, CPU, memory).&lt;/P&gt;&lt;P&gt;My short suggestion: if you're just looking at a test/dev/pilot system, then multiple VMs are fine. But for production, consider scaling out on bare metal.&lt;/P&gt;&lt;P&gt;Hope that helps.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jun 2016 18:46:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139983#M32007</guid>
      <dc:creator>drussell</dc:creator>
      <dc:date>2016-06-15T18:46:32Z</dc:date>
    </item>
    <item>
      <title>Re: Which way of HDP cluster setup is best, having physical nodes or having multiple VMs with few physical nodes</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139984#M32008</link>
      <description>&lt;P&gt;There are pros and cons to both. VMs have a negative impact on performance, so we would normally go for bare metal. MapReduce scales well to lots of disks/processes, even on a single data node.&lt;/P&gt;&lt;P&gt;However, there are limits on VERY big nodes (there are new Apollo servers with 24 drives): you will want to increase the HDFS DataNode memory, and you may have issues with very big block reports being sent around. In that case, logically splitting a node into multiple smaller VMs might solve these issues.&lt;/P&gt;&lt;P&gt;But normally I would say go bare metal.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jun 2016 20:23:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Which-way-of-HDP-cluster-setup-is-best-having-physical-nodes/m-p/139984#M32008</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-06-15T20:23:29Z</dc:date>
    </item>
  </channel>
</rss>

