<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: MapReduce 2 Optimization in Heterogeneous Cluster in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36561#M15232</link>
    <description>&lt;P&gt;A vcore is a virtual core. You can define it however you want.&lt;/P&gt;&lt;P&gt;You could, as an example, define a vcore as the processing power delivered by a 1 GHz thread or core. A 3 GHz core would then be comparable to 3 vcores in the node manager. Your container request then needs to ask for multiple vcores, which accounts for the difference in speed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Not many clusters do this because of the administrative overhead, and because the faster machines can be overloaded if end users do not request vcores correctly.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
    <pubDate>Sun, 24 Jan 2016 23:09:53 GMT</pubDate>
    <dc:creator>Wilfred</dc:creator>
    <dc:date>2016-01-24T23:09:53Z</dc:date>
    <item>
      <title>MapReduce 2 Optimization in Heterogeneous Cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36318#M15229</link>
      <description>&lt;P&gt;I have this configuration:&lt;BR /&gt;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; Hadoop: v2 (YARN)&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; An input file: Size = 100 GB&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3 Slaves: each has 4 VCORES with Speed = 2 GHz and RAM = 8 GB&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5 Slaves: each has 2 VCORES with Speed = 1 GHz and RAM = 2 GB&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; MapReduce program: WordCount&lt;BR /&gt;&lt;BR /&gt;How can I minimize WordCount execution time by assigning small input splits to the 5 slower slaves and big input splits to the 3 faster slaves?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:57:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36318#M15229</guid>
      <dc:creator>jalil1408</dc:creator>
      <dc:date>2022-09-16T09:57:49Z</dc:date>
    </item>
    <item>
      <title>Re: MapReduce 2 Optimization in Heterogeneous Cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36455#M15230</link>
      <description>&lt;P&gt;You need to set up each node with the proper vcores and memory available for the NM (NodeManager). That should solve the problem: it will put more load on the larger nodes than on the small nodes.&lt;/P&gt;&lt;P&gt;Containers are also scheduled on nodes based on data locality, which is out of your control.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You cannot, however, request that processing of a split start on a specific node.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2016 03:12:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36455#M15230</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2016-01-21T03:12:11Z</dc:date>
    </item>
    <item>
      <title>Re: MapReduce 2 Optimization in Heterogeneous Cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36469#M15231</link>
      <description>&lt;P&gt;I wonder why YARN doesn't support VCORE SPEED in container configuration!&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jan 2016 08:54:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36469#M15231</guid>
      <dc:creator>jalil1408</dc:creator>
      <dc:date>2016-01-21T08:54:13Z</dc:date>
    </item>
    <item>
      <title>Re: MapReduce 2 Optimization in Heterogeneous Cluster</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36561#M15232</link>
      <description>&lt;P&gt;A vcore is a virtual core. You can define it however you want.&lt;/P&gt;&lt;P&gt;You could, as an example, define a vcore as the processing power delivered by a 1 GHz thread or core. A 3 GHz core would then be comparable to 3 vcores in the node manager. Your container request then needs to ask for multiple vcores, which accounts for the difference in speed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Not many clusters do this because of the administrative overhead, and because the faster machines can be overloaded if end users do not request vcores correctly.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Wilfred&lt;/P&gt;</description>
      <pubDate>Sun, 24 Jan 2016 23:09:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/MapReduce-2-Optimization-in-Heterogeneous-Cluster/m-p/36561#M15232</guid>
      <dc:creator>Wilfred</dc:creator>
      <dc:date>2016-01-24T23:09:53Z</dc:date>
    </item>
  </channel>
</rss>

