<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to run Spark job from Oozie Workflow on HDP/hue in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100207#M63226</link>
    <description>&lt;P&gt;I figured it out myself. Here are the steps:&lt;/P&gt;&lt;P&gt;1: Download the sandbox or use your existing sandbox (HDP 2.3.2).&lt;/P&gt;&lt;P&gt;2: Create a workflow in Hue's Oozie editor.&lt;/P&gt;&lt;P&gt;3: Click "Edit Properties" and add a property under Oozie parameters: oozie.action.sharelib.for.spark = spark,hcatalog,hive&lt;/P&gt;&lt;P&gt;4: Click the Save button.&lt;/P&gt;&lt;P&gt;5: Add a shell action and fill in the name field. The shell command field may be required; enter any temporary string and save the shell action. We will come back and edit it later.&lt;/P&gt;&lt;P&gt;6: Close the workflow and open the file browser; click oozie, then workspaces. Identify the _hue_xxx directory for the workflow you are creating.&lt;/P&gt;&lt;P&gt;7: Create a lib directory.&lt;/P&gt;&lt;P&gt;8: Copy the jar file that contains your Spark Java program into it.&lt;/P&gt;&lt;P&gt;9: Move up one directory and copy in a shell file (e.g. script.sh) that contains:&lt;/P&gt;&lt;P&gt;spark-submit --class JDBCTest spark-test-1.0.jar&lt;/P&gt;&lt;P&gt;spark-test-1.0.jar is the file you uploaded to the lib directory.&lt;/P&gt;&lt;P&gt;10: Go back to the workflow web page.&lt;/P&gt;&lt;P&gt;11: Open the shell action and set the Shell command by selecting the shell file (e.g. script.sh).&lt;/P&gt;&lt;P&gt;12: Also populate the Files field with the shell file (e.g. script.sh) again.&lt;/P&gt;&lt;P&gt;13: Click Done.&lt;/P&gt;&lt;P&gt;14: Save the workflow.&lt;/P&gt;&lt;P&gt;15: Submit the workflow.&lt;/P&gt;&lt;P&gt;16: It should run.&lt;/P&gt;&lt;P&gt;My Java program does something like this:&lt;/P&gt;&lt;P&gt;Statement stmt = con.createStatement();&lt;/P&gt;&lt;P&gt;String sql = "SELECT s07.description AS job_category, s07.salary , s08.salary , (s08.salary - s07.salary) AS salary_difference FROM sample_07 s07 JOIN sample_08 s08 ON ( s07.code = s08.code) WHERE s07.salary &amp;lt; s08.salary SORT BY s08.salary-s07.salary DESC LIMIT 5";&lt;/P&gt;&lt;P&gt;ResultSet res = stmt.executeQuery(sql);&lt;/P&gt;&lt;P&gt;It uses the Hive JDBC driver.&lt;/P&gt;</description>
    <pubDate>Sat, 30 Jan 2016 04:44:52 GMT</pubDate>
    <dc:creator>shigeru_takehar</dc:creator>
    <dc:date>2016-01-30T04:44:52Z</dc:date>
    <item>
      <title>How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100203#M63222</link>
      <description>&lt;P&gt;I have created a small Java program for Spark. It works with the "spark-submit" command. I would like to run it from an Oozie workflow. It seems HDP 2.3 is capable of running a Spark job from an Oozie workflow, but in Hue's GUI I don't have the option of including a Spark job in a workflow. How do I do this?&lt;/P&gt;</description>
      <pubDate>Mon, 21 Dec 2015 11:53:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100203#M63222</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2015-12-21T11:53:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100204#M63223</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/1646/shigerutakehara.html"&gt;Shigeru Takehara&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please refer to this: &lt;A href="https://developer.ibm.com/hadoop/blog/2015/11/05/run-spark-job-yarn-oozie/"&gt;https://developer.ibm.com/hadoop/blog/2015/11/05/r...&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 21 Dec 2015 17:27:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100204#M63223</guid>
      <dc:creator>KuldeepK</dc:creator>
      <dc:date>2015-12-21T17:27:35Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100205#M63224</link>
      <description>&lt;P&gt;There is a bug which requires you to manually copy hive/hcat jars into spark shared lib dir in order to get this to work:&lt;/P&gt;&lt;P&gt;&lt;A href="https://issues.apache.org/jira/browse/OOZIE-2277"&gt;https://issues.apache.org/jira/browse/OOZIE-2277&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 22 Dec 2015 00:38:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100205#M63224</guid>
      <dc:creator>abajwa</dc:creator>
      <dc:date>2015-12-22T00:38:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100206#M63225</link>
      <description>&lt;P&gt;I'm new to the HDP/Big Data environment. I understand what is described, but I don't know how to apply it to an HDP 2.3 environment. Also, I would like to run it from Hue's GUI Oozie workflow editor. Could you explain step by step?&lt;/P&gt;&lt;P&gt;Thanks a lot.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Dec 2015 03:59:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100206#M63225</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2015-12-22T03:59:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100207#M63226</link>
      <description>&lt;P&gt;I figured it out myself. Here are the steps:&lt;/P&gt;&lt;P&gt;1: Download the sandbox or use your existing sandbox (HDP 2.3.2).&lt;/P&gt;&lt;P&gt;2: Create a workflow in Hue's Oozie editor.&lt;/P&gt;&lt;P&gt;3: Click "Edit Properties" and add a property under Oozie parameters: oozie.action.sharelib.for.spark = spark,hcatalog,hive&lt;/P&gt;&lt;P&gt;4: Click the Save button.&lt;/P&gt;&lt;P&gt;5: Add a shell action and fill in the name field. The shell command field may be required; enter any temporary string and save the shell action. We will come back and edit it later.&lt;/P&gt;&lt;P&gt;6: Close the workflow and open the file browser; click oozie, then workspaces. Identify the _hue_xxx directory for the workflow you are creating.&lt;/P&gt;&lt;P&gt;7: Create a lib directory.&lt;/P&gt;&lt;P&gt;8: Copy the jar file that contains your Spark Java program into it.&lt;/P&gt;&lt;P&gt;9: Move up one directory and copy in a shell file (e.g. script.sh) that contains:&lt;/P&gt;&lt;P&gt;spark-submit --class JDBCTest spark-test-1.0.jar&lt;/P&gt;&lt;P&gt;spark-test-1.0.jar is the file you uploaded to the lib directory.&lt;/P&gt;&lt;P&gt;10: Go back to the workflow web page.&lt;/P&gt;&lt;P&gt;11: Open the shell action and set the Shell command by selecting the shell file (e.g. script.sh).&lt;/P&gt;&lt;P&gt;12: Also populate the Files field with the shell file (e.g. script.sh) again.&lt;/P&gt;&lt;P&gt;13: Click Done.&lt;/P&gt;&lt;P&gt;14: Save the workflow.&lt;/P&gt;&lt;P&gt;15: Submit the workflow.&lt;/P&gt;&lt;P&gt;16: It should run.&lt;/P&gt;&lt;P&gt;My Java program does something like this:&lt;/P&gt;&lt;P&gt;Statement stmt = con.createStatement();&lt;/P&gt;&lt;P&gt;String sql = "SELECT s07.description AS job_category, s07.salary , s08.salary , (s08.salary - s07.salary) AS salary_difference FROM sample_07 s07 JOIN sample_08 s08 ON ( s07.code = s08.code) WHERE s07.salary &amp;lt; s08.salary SORT BY s08.salary-s07.salary DESC LIMIT 5";&lt;/P&gt;&lt;P&gt;ResultSet res = stmt.executeQuery(sql);&lt;/P&gt;&lt;P&gt;It uses the Hive JDBC driver.&lt;/P&gt;</description>
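      <!--
      For readers who submit the same workflow from the Oozie command line rather than from Hue, the parameter added in step 3 and the workspace directory found in step 6 might be written in a job.properties file roughly as follows. This is only a sketch under stated assumptions, not the configuration Hue generates: the host names, ports, and the /user/hue prefix are guesses for a single-node sandbox, and _hue_xxx is the placeholder from the post, not a real directory name.

      ```properties
      # Sketch only: host names, ports, and the workspace path are assumptions.
      # Assumed NameNode and ResourceManager addresses for a single-node sandbox.
      nameNode=hdfs://sandbox.hortonworks.com:8020
      jobTracker=sandbox.hortonworks.com:8050

      # Let the workflow pick up the Oozie system sharelib from HDFS.
      oozie.use.system.libpath=true

      # From step 3: sharelib directories to include for the Spark action type.
      oozie.action.sharelib.for.spark=spark,hcatalog,hive

      # Workspace directory from step 6; _hue_xxx is the placeholder used in the post.
      oozie.wf.application.path=${nameNode}/user/hue/oozie/workspaces/_hue_xxx
      ```

      With this layout, the jar uploaded to the lib directory and the script.sh file both live under the application path, which matches steps 7 to 9.
      -->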
      <pubDate>Sat, 30 Jan 2016 04:44:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100207#M63226</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-01-30T04:44:52Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100208#M63227</link>
      <description>&lt;P&gt;I also tested HiveContext so that Hive processing works in Spark memory. It works.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Jan 2016 06:41:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100208#M63227</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-01-30T06:41:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100209#M63228</link>
      <description>&lt;P&gt;It looks like HDP 2.3.2 already has this patch.&lt;/P&gt;</description>
      <pubDate>Sat, 30 Jan 2016 06:41:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100209#M63228</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-01-30T06:41:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100210#M63229</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/132/abajwa.html" nodeid="132"&gt;@Ali Bajwa&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/1646/shigerutakehara.html" nodeid="1646"&gt;@Shigeru Takehara&lt;/A&gt; when you specify oozie.action.sharelib.for.spark = spark,hcatalog,hive&lt;/P&gt;&lt;P&gt;it will include those libraries with Spark. A trick I learned the hard way :).&lt;/P&gt;</description>
      <pubDate>Mon, 01 Feb 2016 02:00:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100210#M63229</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-01T02:00:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100211#M63230</link>
      <description>&lt;P&gt;I looked at the SparkMain class contained within the &lt;STRONG&gt;oozie-sharelib-spark-4.2.0.2.3.4.1-10.jar&lt;/STRONG&gt; that comes with the Spark 1.6 TP, and it does not appear to have the fix for &lt;A href="https://issues.apache.org/jira/browse/OOZIE-2277"&gt;https://issues.apache.org/jira/browse/OOZIE-2277&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 03 Feb 2016 23:58:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100211#M63230</guid>
      <dc:creator>Craig M</dc:creator>
      <dc:date>2016-02-03T23:58:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100212#M63231</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1598/cmuchinsky.html" nodeid="1598"&gt;@cmuchinsky&lt;/A&gt; Spark Oozie action is not supported in HDP at this moment. It is explicitly stated in our Spark User guide.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Feb 2016 23:59:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100212#M63231</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-03T23:59:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100213#M63232</link>
      <description>&lt;P&gt;Understood &lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt;. However, your previous comment seems to indicate you have some knowledge of the Oozie 'oozie.action.sharelib.for.spark' property, so I wanted to point out that the comment by &lt;A rel="user" href="https://community.cloudera.com/users/1646/shigerutakehara.html" nodeid="1646"&gt;@Shigeru Takehara&lt;/A&gt; indicating OOZIE-2277 was fixed doesn't seem to jibe with the HDP 2.3.4 or 2.3.4.1-TP deliverables.&lt;/P&gt;&lt;P&gt;While Spark via Oozie isn't officially supported, the Hortonworks Support team provided us with a procedure to update the Oozie sharelib for Spark to get it working with 2.3.4. However, that no longer seems to work with the Spark 1.6 enabled 2.3.4.1-TP version.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 00:18:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100213#M63232</guid>
      <dc:creator>Craig M</dc:creator>
      <dc:date>2016-02-04T00:18:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100214#M63233</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1598/cmuchinsky.html" nodeid="1598"&gt;@cmuchinsky&lt;/A&gt; I would love to see the steps engineering provided and in general, just because we don't officially support it, doesn't mean it cannot be done. It just means sometimes you have to dig deeper and with Oozie, I have limited patience :).&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 00:21:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100214#M63233</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-04T00:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100215#M63234</link>
      <description>&lt;P&gt;For your review &lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://na9.salesforce.com/articles/en_US/How_To/How-to-run-Spark-Action-in-oozie-of-HDP-2-3-0?popup=true"&gt;https://na9.salesforce.com/articles/en_US/How_To/How-to-run-Spark-Action-in-oozie-of-HDP-2-3-0?popup=true&lt;/A&gt;&lt;/P&gt;&lt;P&gt;While I was not able to see the contents of this link directly (I had to have somebody extract them for me), perhaps you can as a Hortonworks insider.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 00:40:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100215#M63234</guid>
      <dc:creator>Craig M</dc:creator>
      <dc:date>2016-02-04T00:40:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to run Spark job from Oozie Workflow on HDP/hue</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100216#M63235</link>
      <description>&lt;P&gt;awesome! I will try this out and maybe publish an article &lt;A rel="user" href="https://community.cloudera.com/users/1598/cmuchinsky.html" nodeid="1598"&gt;@cmuchinsky&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Feb 2016 00:45:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-run-Spark-job-from-Oozie-Workflow-on-HDP-hue/m-p/100216#M63235</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-04T00:45:28Z</dc:date>
    </item>
  </channel>
</rss>

