<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: running/scheduling an Oozie job(s) with mapreduce scripts(written in Python) in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117288#M80071</link>
    <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/76767/runningscheduling-an-oozie-jobs-with-mapreduce-scr.html#"&gt;@justlearning&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Oozie does not run MapReduce by itself; it is a Hadoop workflow scheduler that launches workflows composed of jobs, which can be MapReduce jobs.&lt;/P&gt;&lt;P&gt;Here you want to run a job defined by workflow.xml with parameters in job.properties, so the syntax is:&lt;/P&gt;&lt;P&gt;oozie job -oozie &lt;A href="http://sandbox.hortonworks.com:11000/oozie" target="_blank"&gt;http://sandbox.hortonworks.com:11000/oozie&lt;/A&gt; -config job.properties -run&lt;/P&gt;</description>
    <pubDate>Sat, 07 Jan 2017 05:41:34 GMT</pubDate>
    <dc:creator>ledel</dc:creator>
    <dc:date>2017-01-07T05:41:34Z</dc:date>
    <item>
      <title>running/scheduling an Oozie job(s) with mapreduce scripts(written in Python)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117287#M80070</link>
      <description>&lt;P&gt;I think I am facing a configuration issue. Here is the &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/11192-workflow.xml"&gt;workflow.xml&lt;/A&gt; file I am using in an attempt to submit the Oozie job with MapReduce scripts.
I am using the command: $ oozie mapreduce -oozie &lt;A href="http://localhost:11000/oozie" target="_blank"&gt;http://localhost:11000/oozie&lt;/A&gt; -config job.properties&lt;/P&gt;&lt;P&gt;According to:&lt;/P&gt;&lt;P&gt;&lt;A href="http://oozie.apache.org/docs/4.1.0/DG_CommandLineTool.html#Oozie_Command_Line_Usage" target="_blank"&gt;http://oozie.apache.org/docs/4.1.0/DG_CommandLineTool.html#Oozie_Command_Line_Usage&lt;/A&gt;&lt;/P&gt;&lt;P&gt;
"The parameters must be in the Java Properties file (.properties). This file must be specified for a map-reduce job.
The properties file must specify the mapred.mapper.class, mapred.reducer.class, mapred.input.dir, mapred.output.dir, oozie.libpath, mapred.job.tracker, and fs.default.name properties."&lt;/P&gt;&lt;P&gt;The map-reduce job will be created and submitted. All jar files and 
all other files needed by the mapreduce job need to be uploaded onto 
HDFS under libpath beforehand. The workflow.xml will be created in Oozie
 server internally. Users can get the workflow.xml from console or 
command line(-definition).&lt;/P&gt;&lt;P&gt;However, I am getting this error: &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/11193-mapreduce-oozie-error.png"&gt;mapreduce-oozie-error.png&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I am not sure how to configure my workflow.xml file, or what procedure to follow to successfully execute an Oozie job with a MapReduce script written in Python.
  &lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt; &lt;/P&gt;</description>
      <pubDate>Sat, 07 Jan 2017 03:32:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117287#M80070</guid>
      <dc:creator>axel_robinson</dc:creator>
      <dc:date>2017-01-07T03:32:34Z</dc:date>
    </item>
    <item>
      <title>Re: running/scheduling an Oozie job(s) with mapreduce scripts(written in Python)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117288#M80071</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/76767/runningscheduling-an-oozie-jobs-with-mapreduce-scr.html#"&gt;@justlearning&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Oozie does not run MapReduce by itself; it is a Hadoop workflow scheduler that launches workflows composed of jobs, which can be MapReduce jobs.&lt;/P&gt;&lt;P&gt;Here you want to run a job defined by workflow.xml with parameters in job.properties, so the syntax is:&lt;/P&gt;&lt;P&gt;oozie job -oozie &lt;A href="http://sandbox.hortonworks.com:11000/oozie" target="_blank"&gt;http://sandbox.hortonworks.com:11000/oozie&lt;/A&gt; -config job.properties -run&lt;/P&gt;</description>
      <pubDate>Sat, 07 Jan 2017 05:41:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117288#M80071</guid>
      <dc:creator>ledel</dc:creator>
      <dc:date>2017-01-07T05:41:34Z</dc:date>
    </item>
    <item>
      <title>Re: running/scheduling an Oozie job(s) with mapreduce scripts(written in Python)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117289#M80072</link>
      <description>&lt;P&gt;OK, thank you &lt;A rel="user" href="https://community.cloudera.com/users/133/ledel.html" nodeid="133"&gt;@Laurent Edel&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;How should I configure my &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/11195-workflow.xml"&gt;workflow.xml&lt;/A&gt; file, or what is the procedure to successfully execute an Oozie job with a MapReduce script written in Python?&lt;/P&gt;</description>
      <pubDate>Sat, 07 Jan 2017 06:02:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117289#M80072</guid>
      <dc:creator>axel_robinson</dc:creator>
      <dc:date>2017-01-07T06:02:38Z</dc:date>
    </item>
    <item>
      <title>Re: running/scheduling an Oozie job(s) with mapreduce scripts(written in Python)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117290#M80073</link>
      <description>&lt;P&gt;You should consider running Hadoop streaming with your Python mapper and reducer.&lt;/P&gt;&lt;P&gt;Take a look at &lt;A href="https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.2.3_Streaming"&gt;https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.2.3_Streaming&lt;/A&gt; for an example of such a workflow.&lt;/P&gt;&lt;P&gt;First try to run your streaming job directly, with something like:&lt;/P&gt;&lt;PRE&gt;yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input /user/theuser/input.csv -output /user/theuser/out
&lt;/PRE&gt;&lt;P&gt;Once that works, it will be easier to schedule it with Oozie. In the worst case, you can use a shell action that runs that command.&lt;/P&gt;&lt;P&gt;Please accept this answer if it answered your question.&lt;/P&gt;</description>
      <pubDate>Sat, 07 Jan 2017 06:27:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/running-scheduling-an-Oozie-job-s-with-mapreduce-scripts/m-p/117290#M80073</guid>
      <dc:creator>ledel</dc:creator>
      <dc:date>2017-01-07T06:27:55Z</dc:date>
    </item>
  </channel>
</rss>

