<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: What is the difference between spark_shuffle &amp; spark2_shuffle in yarn.nodemanager.aux-services in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/399041#M250379</link>
    <description></description>
    <pubDate>Wed, 18 Dec 2024 20:28:30 GMT</pubDate>
    <dc:creator>Shelton</dc:creator>
    <dc:date>2024-12-18T20:28:30Z</dc:date>
    <item>
      <title>What is the difference between spark_shuffle &amp; spark2_shuffle in yarn.nodemanager.aux-services</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/397995#M250034</link>
      <description>&lt;P&gt;In Apache Spark, spark_shuffle and spark2_shuffle are configuration options related to Spark's shuffle operations, which can be set to start auxiliary services within the YARN NodeManager. But what is the difference between the two?&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 00:39:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/397995#M250034</guid>
      <dc:creator>allen_chu</dc:creator>
      <dc:date>2024-11-26T00:39:53Z</dc:date>
    </item>
    <item>
      <title>Re: What is the difference between spark_shuffle &amp; spark2_shuffle in yarn.nodemanager.aux-services</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/399041#M250379</link>
      <description>&lt;P&gt;&lt;FONT size="2"&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/118899"&gt;@allen_chu&lt;/a&gt;&amp;nbsp;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="2"&gt;I may not have fully understood the question, but here are the differences, with enough explanation to help you configure the two options correctly.&lt;/FONT&gt;&lt;/P&gt;&lt;H3&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;1. Difference Between spark_shuffle and spark2_shuffle&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H3&gt;&lt;H4&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;spark_shuffle&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H4&gt;&lt;UL&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Used for &lt;STRONG&gt;Apache Spark 1.x&lt;/STRONG&gt; versions.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Refers to the shuffle service for older Spark releases that rely on the original shuffle mechanism.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Declared in the YARN NodeManager configuration (yarn-site.xml):&lt;/FONT&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;mapreduce_shuffle,spark_shuffle&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;H4&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;spark2_shuffle&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H4&gt;&lt;UL&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Introduced for &lt;STRONG&gt;Apache Spark 2.x and later&lt;/STRONG&gt; versions.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Handles shuffle operations for newer Spark versions, which have an updated shuffle mechanism with better performance and scalability.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Declared similarly in yarn-site.xml:&lt;/FONT&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;mapreduce_shuffle,spark2_shuffle&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;H3&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;2. Why Two Shuffle Services?&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;Backward Compatibility&lt;/STRONG&gt;: spark_shuffle is retained so that Spark 1.x jobs keep running without modification.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;Separate Service&lt;/STRONG&gt;: spark2_shuffle gives jobs running on Spark 2.x+ an optimized, compatible shuffle service that does not interfere with Spark 1.x jobs.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;Upgrade Path&lt;/STRONG&gt;: In clusters that support multiple Spark versions, both shuffle services can coexist, so jobs submitted with Spark 1.x and Spark 2.x can run at the same time.&lt;/FONT&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;H3&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;3. Configuration in YARN&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H3&gt;&lt;P&gt;&lt;FONT size="2"&gt;To enable the shuffle service for both versions, configure the NodeManager to start both services:&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;mapreduce_shuffle,spark_shuffle,spark2_shuffle&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services.spark_shuffle.class&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;org.apache.spark.network.yarn.YarnShuffleService&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
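&amp;lt;!-- Sketch, not part of the original reply: both services name the same class, so the
     NodeManager needs some way to load a different Spark build for each one. Hadoop releases
     that support per-service classpaths expose yarn.nodemanager.aux-services.NAME.classpath
     for this purpose. The two paths below are hypothetical placeholders; replace them with the
     directories that hold each Spark version's YARN shuffle jar on your NodeManager hosts. --&amp;gt;
&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services.spark_shuffle.classpath&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;/path/to/spark1/aux/*&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services.spark2_shuffle.classpath&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;/path/to/spark2/aux/*&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;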
&amp;lt;property&amp;gt;
    &amp;lt;name&amp;gt;yarn.nodemanager.aux-services.spark2_shuffle.class&amp;lt;/name&amp;gt;
    &amp;lt;value&amp;gt;org.apache.spark.network.yarn.YarnShuffleService&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;H3&gt;&lt;FONT size="2"&gt;&lt;STRONG&gt;4. Key Points&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/H3&gt;&lt;UL&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Use &lt;STRONG&gt;spark_shuffle&lt;/STRONG&gt; for jobs running with Spark 1.x.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT size="2"&gt;Use &lt;STRONG&gt;spark2_shuffle&lt;/STRONG&gt; for jobs running with Spark 2.x or later.&lt;/FONT&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;FONT size="2"&gt;In modern setups, &lt;STRONG&gt;spark2_shuffle&lt;/STRONG&gt; is the primary shuffle service, since Spark 1.x is largely deprecated.&lt;BR /&gt;&lt;BR /&gt;Happy hadooping&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 18 Dec 2024 20:28:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/399041#M250379</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2024-12-18T20:28:30Z</dc:date>
    </item>
    <item>
      <title>Re: What is the difference between spark_shuffle &amp; spark2_shuffle in yarn.nodemanager.aux-services</title>
      <link>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/399175#M250430</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/20288"&gt;@Shelton&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your reply. This information is very helpful.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Dec 2024 02:44:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/What-is-the-difference-between-spark-shuffle-amp-spark2/m-p/399175#M250430</guid>
      <dc:creator>allen_chu</dc:creator>
      <dc:date>2024-12-20T02:44:27Z</dc:date>
    </item>
  </channel>
</rss>

