<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Oozie Spark Action on Yarn - HADOOP_CONF_DIR or YARN_CONF_DIR in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45310#M41180</link>
    <description>You cannot run Spark on MR1 clusters. You will need to set up a YARN cluster first, and switch Oozie over to it, before you can attempt the Spark action.&lt;BR /&gt;&lt;BR /&gt;To migrate to YARN, please follow&lt;BR /&gt;&lt;A href="https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_mr_and_yarn.html#xd_583c10bfdbd326ba--6eed2fb8-14349d04bee--7f23__section_dtc_lwx_yq" target="_blank"&gt;https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_mr_and_yarn.html#xd_583c10bfdbd326ba--6eed2fb8-14349d04bee--7f23__section_dtc_lwx_yq&lt;/A&gt;</description>
    <pubDate>Tue, 20 Sep 2016 13:47:32 GMT</pubDate>
    <dc:creator>Harsh J</dc:creator>
    <dc:date>2016-09-20T13:47:32Z</dc:date>
    <item>
      <title>Oozie Spark Action on Yarn - HADOOP_CONF_DIR or YARN_CONF_DIR?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45309#M41179</link>
      <description>&lt;P&gt;Hello, I'm currently learning to use the Spark Action with Oozie on CDH 5.8.&lt;/P&gt;&lt;P&gt;The workflow runs fine with master=local[*] and mode=client. However, it behaves very differently with YARN client/cluster mode. When I run the job, I get:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;2016-09-20 06:04:14,028 WARN org.apache.oozie.action.hadoop.SparkActionExecutor: SERVER[master.meshiang] USER[root] GROUP[-] TOKEN[] APP[CSV] JOB[0000007-160920052847518-oozie-oozi-W] ACTION[0000007-160920052847518-oozie-oozi-W@spark-2bab] Launcher exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
java.lang.Exception: When running with master 'yarn-client' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
	at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:251)
	at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:228)
	at org.apache.spark.deploy.SparkSubmitArguments.&amp;lt;init&amp;gt;(SparkSubmitArguments.scala:109)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:114)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:256)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:207)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:52)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:236)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)&lt;/PRE&gt;&lt;P&gt;I know I have to specify HADOOP_CONFIG_DIR and YARN_CONFIG_DIR. &lt;STRONG&gt;But how and where?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What I have already tried:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Following the &lt;STRONG&gt;spark-opts&lt;/STRONG&gt; configuration from &lt;A href="http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SparkActionExtension.html#Spark_on_YARN" target="_blank"&gt;http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SparkActionExtension.html#Spark_on_YARN&lt;/A&gt;. In the Spark Action &amp;gt; Options tab in Hue, I entered the following configuration:&lt;BR /&gt;&lt;BR /&gt;&lt;PRE&gt;--conf spark.yarn.historyServer.address=http://datanode1.meshiang:18088
--conf spark.eventLog.dir=${nameNode}/user/spark/applicationHistory
--conf spark.eventLog.enabled=true&lt;/PRE&gt;&lt;EM&gt;I don't know whether this is necessary, since this feature is already included in &lt;A href="https://archive.cloudera.com/cdh5/cdh/5/oozie-4.1.0-cdh5.7.2.releasenotes.html" target="_self"&gt;CDH 5.7.2 [OOZIE-2170]&lt;/A&gt;.&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;Setting HADOOP_CONFIG_DIR and YARN_CONFIG_DIR on the Oozie server node using&lt;BR /&gt;&lt;BR /&gt;&lt;PRE&gt;export HADOOP_CONFIG_DIR=/etc/hadoop/conf
export YARN_CONFIG_DIR=/etc/hadoop/conf&lt;/PRE&gt;&lt;/LI&gt;&lt;LI&gt;Specifying HADOOP_CONFIG_DIR and YARN_CONFIG_DIR in the Spark Action spark-opts&lt;BR /&gt;&lt;BR /&gt;&lt;PRE&gt;--conf spark.yarn.appMasterEnv.HADOOP_CONFIG_DIR=/etc/hadoop/conf
--conf spark.yarn.appMasterEnv.YARN_CONFIG_DIR=/etc/hadoop/conf&lt;/PRE&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PS: I'm using Oozie, Spark, and MRv1 (for running the Oozie Launcher) from CDH 5.8, without changing any of their default settings.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:40:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45309#M41179</guid>
      <dc:creator>Stefanie</dc:creator>
      <dc:date>2022-09-16T10:40:11Z</dc:date>
    </item>
    <item>
      <title>Re: Oozie Spark Action on Yarn - HADOOP_CONF_DIR or YARN_CONF_DIR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45310#M41180</link>
      <description>You cannot run Spark on MR1 clusters. You will need to set up a YARN cluster first, and switch Oozie over to it, before you can attempt the Spark action.&lt;BR /&gt;&lt;BR /&gt;To migrate to YARN, please follow&lt;BR /&gt;&lt;A href="https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_mr_and_yarn.html#xd_583c10bfdbd326ba--6eed2fb8-14349d04bee--7f23__section_dtc_lwx_yq" target="_blank"&gt;https://www.cloudera.com/documentation/enterprise/latest/topics/cm_mc_mr_and_yarn.html#xd_583c10bfdbd326ba--6eed2fb8-14349d04bee--7f23__section_dtc_lwx_yq&lt;/A&gt;</description>
      <pubDate>Tue, 20 Sep 2016 13:47:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45310#M41180</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-09-20T13:47:32Z</dc:date>
    </item>
    <item>
      <title>Re: Oozie Spark Action on Yarn - HADOOP_CONF_DIR or YARN_CONF_DIR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45311#M41181</link>
      <description>&lt;P&gt;Thank you for the response!&lt;/P&gt;&lt;P&gt;I'm sorry, I forgot to mention that I already have YARN on my cluster. The Spark job runs fine from the terminal with spark-submit --master yarn --deploy-mode cluster. However, when I run it as an Oozie workflow, Oozie fails with the error above.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Do you mean that I need to switch my Oozie Launcher to use MRv2 / YARN?&lt;/P&gt;</description>
      <pubDate>Tue, 20 Sep 2016 13:54:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45311#M41181</guid>
      <dc:creator>Stefanie</dc:creator>
      <dc:date>2016-09-20T13:54:42Z</dc:date>
    </item>
    <item>
      <title>Re: Oozie Spark Action on Yarn - HADOOP_CONF_DIR or YARN_CONF_DIR</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45312#M41182</link>
      <description>Yes, you need to switch Oozie to submit over YARN and not MRv1. The switching guide covers this aspect.</description>
      <pubDate>Tue, 20 Sep 2016 13:55:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Oozie-Spark-Action-on-Yarn-HADOOP-CONF-DIR-or-YARN-CONF-DIR/m-p/45312#M41182</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-09-20T13:55:32Z</dc:date>
    </item>
  </channel>
</rss>

