<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to make ApplicationMaster run several containers on specific nodes based on data locality? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/391805#M247782</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/99648"&gt;@husseljo&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Please click "Accept as solution" if my answer helped you.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Aug 2024 16:06:50 GMT</pubDate>
    <dc:creator>AyazHussain</dc:creator>
    <dc:date>2024-08-16T16:06:50Z</dc:date>
    <item>
      <title>How to make ApplicationMaster run several containers on specific nodes based on data locality?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/349651#M235711</link>
      <description>&lt;P&gt;I'm attempting to create a standalone YARN application that runs validation code on HDFS files at the block level. First I need to find out where the file's blocks are located (from the HDFS file metadata), then run the validation code on every block and combine the results to determine whether the file is valid. I want to minimize or eliminate network overhead by running the validation code on the same node where each block resides (data locality).&lt;/P&gt;&lt;P&gt;For instance, a 1&amp;nbsp;GB file could be divided into 10 blocks. To run the same validation code on all blocks in parallel, I would want to run 10 instances.&lt;/P&gt;&lt;P&gt;My thinking is to launch a single ApplicationMaster with 10 containers.&lt;/P&gt;&lt;P&gt;Does my approach make sense, and if not, what would you suggest I change? If it does make sense, how do I launch 10 instances while (possibly) specifying a different host for each container?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Aug 2022 22:14:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/349651#M235711</guid>
      <dc:creator>husseljo</dc:creator>
      <dc:date>2022-08-05T22:14:21Z</dc:date>
    </item>
    <item>
      <title>Re: How to make ApplicationMaster run several containers on specific nodes based on data locality?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/391451#M247631</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/99648"&gt;@husseljo&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;The ApplicationMaster (AM) runs as a single container.&lt;/P&gt;&lt;P&gt;The Capacity Scheduler uses&amp;nbsp;Delay Scheduling&amp;nbsp;to honor task locality constraints. There are three levels of locality constraint: node-local, rack-local, and off-switch. When a locality constraint cannot be satisfied, the scheduler counts the missed scheduling opportunities and waits for this count to reach a threshold before relaxing the constraint to the next level. You can configure these thresholds using the&amp;nbsp;Node Locality Delay&amp;nbsp;(yarn.scheduler.capacity.node-locality-delay) and&amp;nbsp;Rack Locality Additional Delay&amp;nbsp;(yarn.scheduler.capacity.rack-locality-additional-delay) fields.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Aug 2024 10:58:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/391451#M247631</guid>
      <dc:creator>AyazHussain</dc:creator>
      <dc:date>2024-08-07T10:58:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to make ApplicationMaster run several containers on specific nodes based on data locality?</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/391805#M247782</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/99648"&gt;@husseljo&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;Please click "Accept as solution" if my answer helped you.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Aug 2024 16:06:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-make-ApplicationMaster-run-several-containers-on/m-p/391805#M247782</guid>
      <dc:creator>AyazHussain</dc:creator>
      <dc:date>2024-08-16T16:06:50Z</dc:date>
    </item>
  </channel>
</rss>

