<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Choose Spark version from Oozie job (1.6 vs 2.0) in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210283#M172225</link>
    <description>&lt;P&gt;In the latest HDP 2.6.x, Oozie works with either Spark 1 or Spark 2; it does not support side-by-side deployments.&lt;/P&gt;&lt;P&gt;You can follow &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html"&gt;these instructions&lt;/A&gt; to have Oozie work with different versions of Spark.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Jun 2017 04:42:15 GMT</pubDate>
    <dc:creator>dsun</dc:creator>
    <dc:date>2017-06-16T04:42:15Z</dc:date>
    <item>
      <title>Choose Spark version from Oozie job (1.6 vs 2.0)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210282#M172224</link>
      <description>&lt;P&gt;I have both Spark 1.6 and 2.0 installed on my cluster.  I see in the docs how to manually run a spark-submit job and choose 2.0 &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/spark-choose-version.html"&gt;here&lt;/A&gt;.  However, I launch my jobs using Oozie.  Is there a way to specify, for a given Oozie workflow Spark action, that I want to use the 2.0 engine instead of 1.6?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2017 03:15:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210282#M172224</guid>
      <dc:creator>tshiels</dc:creator>
      <dc:date>2017-06-16T03:15:34Z</dc:date>
    </item>
    <item>
      <title>Re: Choose Spark version from Oozie job (1.6 vs 2.0)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210283#M172225</link>
      <description>&lt;P&gt;In the latest HDP 2.6.x, Oozie works with either Spark 1 or Spark 2; it does not support side-by-side deployments.&lt;/P&gt;&lt;P&gt;You can follow &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html"&gt;these instructions&lt;/A&gt; to have Oozie work with different versions of Spark.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2017 04:42:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210283#M172225</guid>
      <dc:creator>dsun</dc:creator>
      <dc:date>2017-06-16T04:42:15Z</dc:date>
    </item>
    <item>
      <title>Re: Choose Spark version from Oozie job (1.6 vs 2.0)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210284#M172226</link>
      <description>&lt;P&gt;Thank you dsun!  I'm working on these steps today.  It seems from the instructions that once the sharelib for spark2 is set up, I can switch a given workflow to point to spark2 by specifying in job.properties:&lt;/P&gt;&lt;PRE&gt;oozie.action.sharelib.for.spark=spark2&lt;/PRE&gt;&lt;P&gt;This would imply (I assume) that I can easily point back to Spark 1.6.3 by specifying:&lt;/P&gt;&lt;PRE&gt;oozie.action.sharelib.for.spark=spark&lt;/PRE&gt;&lt;P&gt;Is my assumption correct?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2017 20:40:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210284#M172226</guid>
      <dc:creator>tshiels</dc:creator>
      <dc:date>2017-06-16T20:40:57Z</dc:date>
    </item>
    <item>
      <title>Re: Choose Spark version from Oozie job (1.6 vs 2.0)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210285#M172227</link>
      <description>&lt;P&gt;OK, just to update: I followed the directions exactly as given in the link provided by dsun (&lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html"&gt;here&lt;/A&gt;).  With HDP 2.6 and Oozie 4.2, this fails due to a known bug (&lt;A href="https://issues.apache.org/jira/browse/OOZIE-2787"&gt;jira&lt;/A&gt;).  Basically, what works with Spark 1.6 will not work with Spark 2.1 (via Oozie, anyway) due to a change in how Spark handles the same file appearing multiple times in the distributed cache (&lt;A href="https://issues.apache.org/jira/browse/SPARK-18099"&gt;see here&lt;/A&gt;).&lt;/P&gt;&lt;PRE&gt;java.lang.IllegalArgumentException: Attempt to add (hdfs://hdpcluster/user/oozie/share/lib/lib_20170411215324/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.&lt;/PRE&gt;&lt;P&gt;I've tried removing duplicate files, but there are so many (and even some duplicated between the oozie sharelib and the spark2 sharelib) that I'm afraid of removing them all and breaking 1.6 (thus losing the ability to run any existing jobs under 1.6).&lt;/P&gt;&lt;P&gt;It looks like this may be fixed in Oozie 4.3, but I'm not sure how to update just the Oozie service using Ambari (maybe I'll post another question for this).&lt;/P&gt;&lt;P&gt;&lt;EM&gt;EDIT&lt;/EM&gt;: &lt;/P&gt;&lt;P&gt;After removing all duplicate files found between the sharelibs for oozie and spark2, I still could not run a Spark2 job from Oozie 4.2.  I was getting an ImportError for a custom Python file I was trying to import from the main application py file.  It seems that Oozie wasn't setting --py-files correctly (again, this worked fine with Spark 1.6).&lt;/P&gt;&lt;P&gt;In conclusion, this is experimental at best.  Hopefully the next version of HDP will use the latest Oozie 4.3.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2017 23:38:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Choose-Spark-version-from-Oozie-job-1-6-vs-2-0/m-p/210285#M172227</guid>
      <dc:creator>tshiels</dc:creator>
      <dc:date>2017-06-16T23:38:38Z</dc:date>
    </item>
  </channel>
</rss>

