<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Using Yarn as resource manager for standalone Python and R code in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-Yarn-as-resource-manager-for-standalone-Python-and-R/m-p/59102#M67030</link>
    <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is it possible to run "standalone" Python and R code using YARN as a resource manager? By "standalone" we mean native Python or R code, not Spark jobs that follow the Python-&amp;gt;PySpark-&amp;gt;Spark or R-&amp;gt;SparklyR-&amp;gt;Spark execution path. We want to use YARN as a resource allocation service so that the Python and R code runs inside YARN-allocated containers on a cluster node. Obviously, this is not distributed execution; the code is still standalone, but the worker node is supposed to be allocated by YARN. We are on CDH 5.10.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 12:07:34 GMT</pubDate>
    <dc:creator>DButakov</dc:creator>
    <dc:date>2022-09-16T12:07:34Z</dc:date>
    <item>
      <title>Using Yarn as resource manager for standalone Python and R code</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-Yarn-as-resource-manager-for-standalone-Python-and-R/m-p/59102#M67030</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is it possible to run "standalone" Python and R code using YARN as a resource manager? By "standalone" we mean native Python or R code, not Spark jobs that follow the Python-&amp;gt;PySpark-&amp;gt;Spark or R-&amp;gt;SparklyR-&amp;gt;Spark execution path. We want to use YARN as a resource allocation service so that the Python and R code runs inside YARN-allocated containers on a cluster node. Obviously, this is not distributed execution; the code is still standalone, but the worker node is supposed to be allocated by YARN. We are on CDH 5.10.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 12:07:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-Yarn-as-resource-manager-for-standalone-Python-and-R/m-p/59102#M67030</guid>
      <dc:creator>DButakov</dc:creator>
      <dc:date>2022-09-16T12:07:34Z</dc:date>
    </item>
    <item>
      <title>Re: Using Yarn as resource manager for standalone Python and R code</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-Yarn-as-resource-manager-for-standalone-Python-and-R/m-p/59605#M67031</link>
      <description>Yes. The YARN APIs let you distribute and run any arbitrary command. Spark and MR2 are apps that leverage this to run Java commands with wrapper classes that drive their logic and flow, but there's nothing preventing you from writing your own.&lt;BR /&gt;&lt;BR /&gt;Take a look at the Distributed Shell application implementation to understand the raw YARN APIs used to run arbitrary commands via YARN-allocated resource containers: &lt;A href="https://github.com/cloudera/hadoop-common/blob/cdh5.12.0-release/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java#L201" target="_blank"&gt;https://github.com/cloudera/hadoop-common/blob/cdh5.12.0-release/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java#L201&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;If you're asking about a built-in way of running programs over YARN without writing any code, then aside from the DistributedShell there's no other included implementation. Even with the DistributedShell you may not get the tight integration (such as result extraction, status viewing, etc.) you require.&lt;BR /&gt;&lt;BR /&gt;There are likely a few higher-level frameworks that can make developing custom YARN apps easier, such as Spring (&lt;A href="https://spring.io/guides/gs/yarn-basic/" target="_blank"&gt;https://spring.io/guides/gs/yarn-basic/&lt;/A&gt;), Kitten (&lt;A href="https://github.com/cloudera/kitten" target="_blank"&gt;https://github.com/cloudera/kitten&lt;/A&gt;), and Cask's CDAP (&lt;A href="https://docs.cask.co/cdap/current/en/developers-manual/getting-started/index.html" target="_blank"&gt;https://docs.cask.co/cdap/current/en/developers-manual/getting-started/index.html&lt;/A&gt;).</description>
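As a rough illustration of the DistributedShell route described in the reply above, the Python sketch below only assembles the yarn jar command line for a single-container run; it does not contact a cluster. The helper name build_dshell_command, the jar path, and run_model.sh (a wrapper script that would call python or Rscript) are hypothetical. The option names are the ones defined in the linked Client.java, so they should be checked against the installed CDH version.

```python
import shlex

# Fully qualified client class, taken from the Client.java path linked in the reply.
DSHELL_CLIENT = "org.apache.hadoop.yarn.applications.distributedshell.Client"


def build_dshell_command(dshell_jar, script_path, memory_mb=1024, vcores=1):
    """Return the argv list for a one-container DistributedShell run of a script."""
    return [
        "yarn", "jar", dshell_jar, DSHELL_CLIENT,
        "-jar", dshell_jar,              # jar that carries the ApplicationMaster
        "-shell_script", script_path,    # local script shipped to the container
        "-num_containers", "1",          # standalone run: a single container
        "-container_memory", str(memory_mb),
        "-container_vcores", str(vcores),
    ]


if __name__ == "__main__":
    # Hypothetical paths; adjust them to the actual cluster layout.
    argv = build_dshell_command(
        "/path/to/hadoop-yarn-applications-distributedshell.jar",
        "run_model.sh",
    )
    # Print a copy-pasteable command instead of executing it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

DistributedShell ships the script but not the runtime, so a run like this can only work if the node that receives the container already has the required Python or R interpreter and libraries installed.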
      <pubDate>Wed, 06 Sep 2017 06:59:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-Yarn-as-resource-manager-for-standalone-Python-and-R/m-p/59605#M67031</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2017-09-06T06:59:53Z</dc:date>
    </item>
  </channel>
</rss>

