Suppose you'd like to use Hue to configure an Oozie workflow that includes a MapReduce job, and you're having a hard time figuring out where the arguments go.


For example, say you want to run the famous wordcount jar, with the twist that it takes a date variable which you will define in a coordinator:


$ bin/hadoop jar /usr/joe/wordcount.jar org.myorg.WordCount /usr/joe/wordcount/input/${date} /usr/joe/wordcount/output/${date}
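One way the `${date}` variable can be defined is in the coordinator that schedules the workflow. This is a hedged sketch, not from the original article: the app name, paths, and schedule are placeholders, and the coordinator passes `date` to the workflow as a property derived from the nominal run time.

```xml
<!-- Illustrative coordinator snippet: defines the ${date} variable the
     workflow references. Names, dates, and paths are placeholders. -->
<coordinator-app name="wordcount-coord" frequency="${coord:days(1)}"
                 start="2015-01-01T00:00Z" end="2015-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- ${workflowAppPath} is a placeholder for your workflow's HDFS path -->
      <app-path>${workflowAppPath}</app-path>
      <configuration>
        <property>
          <name>date</name>
          <!-- resolves to the scheduled run date, e.g. 2015-01-01 -->
          <value>${coord:formatTime(coord:nominalTime(), 'yyyy-MM-dd')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```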


In the Hue Oozie editor, only the place where the jar file goes is obvious. But where do the following go:


- classname

- input

- output


How do you specify these pieces?




Applies To


CDH 4, CDH 5, Hue, Oozie







Here is an example if you are using a Driver class:



  • Pull the source code for the new PiEstimator and compile it with Maven. Requires Git, Maven, and Java:
  • git clone
  • cd oozie_pi_load_test/PiEstimatorKrbSrc
  • vi pom.xml
  • set hadoop-core and hadoop-client to match your version.
  • mvn clean install
  • Copy oozie_pi_load_test/PiEstimatorKrbSrc/target/PiEstimatorKrb-1.0.jar to some location in HDFS. Make sure it's readable by whichever Hue user will run the workflow.
  • Go to Hue browser and go to the Oozie app
  • Go to the Workflows tab
  • Click "Create"
  • Enter a name and description
  • Click Save
  • Drag "Java" from the actions above to the slot between "Start" and "end"
  • Give it a name and description
  • For the Jar name, click the browse button
  • Find the PiEstimatorKrb-1.0.jar file you put in HDFS
  • For "Main Class" enter "com.test.PiEstimatorKrb"
  • For "Arguments" enter "<tempdir> <nMaps> <nSamples>" by replacing those with correct values. For example "/user/cconner/pi_temp 4 1000", base the nMaps and nSamples on what you would normally use for the Pi example.
  • Click "add path" next to "Files" and search for PiEstimatorKrb-1.0.jar in HDFS.
  • Click Done.
  • Click Save.
  • Click Submit on the left.
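Hue generates the workflow XML for you, so you never have to write it by hand, but it can help to see roughly what those form fields map to. The fragment below is an illustrative sketch, not Hue's exact output: `${jobTracker}` and `${nameNode}` are standard placeholders, and the jar is assumed to sit in the workflow's workspace.

```xml
<!-- Rough sketch of the Java action the steps above configure.
     The "Main Class" and "Arguments" fields become <main-class> and <arg>
     elements; the "Files" entry becomes a <file> element. -->
<action name="PiEstimatorKrb">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <main-class>com.test.PiEstimatorKrb</main-class>
    <arg>/user/cconner/pi_temp</arg>
    <arg>4</arg>
    <arg>1000</arg>
    <file>PiEstimatorKrb-1.0.jar</file>
  </java>
  <ok to="end"/>
  <error to="kill"/>
</action>
```

With a driver class, the driver parses the arguments itself, which is why everything can be passed as plain `<arg>` values.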



Here is an example not using a Driver class:



  • Put /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar somewhere in HDFS. I put it in /user/oozie and made it readable by everyone.
  • Create a directory in HDFS for the job. I did "hadoop fs -mkdir teragen_oozie".
  • Create an empty input directory in HDFS for the job. I did "hadoop fs -mkdir teragen_oozie/input".
  • Go into Hue->Oozie and click "Create workflow".
  • Enter a Name, Description, and "HDFS deployment directory", setting the latter to the location above.
  • Click Save.
  • Click the + button for Mapreduce.
  • Enter a name for the MR task.
  • For the Jar name, browse to the location where you put hadoop-mapreduce-examples.jar above.
  • Click "Add Property" for Job Properties and add the following:
	mapred.input.dir = hdfs://
	mapred.output.dir = hdfs://
	mapred.mapper.class = org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper
	terasort.num-rows = 500
  • Click "Add delete" for Prepare and specify "hdfs://" as the location.
  • Click Save.
  • Now run the workflow and it should succeed.

NOTE: change the hdfs:// URI to point at the correct NameNode (NN) for your environment.
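Without a driver class, there are no `<arg>` elements at all: the mapper class and the input/output paths are passed as job properties instead. The sketch below is illustrative only; the output and temp paths are guesses based on the directories created above, and `${nameNode}` stands in for your hdfs:// URI.

```xml
<!-- Rough sketch of the map-reduce action the steps above configure.
     Paths under /user/cconner are hypothetical examples. -->
<action name="teragen">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <prepare>
      <!-- the "Add delete" step: clears the output dir so reruns succeed -->
      <delete path="${nameNode}/user/cconner/teragen_oozie/output"/>
    </prepare>
    <configuration>
      <property>
        <name>mapred.input.dir</name>
        <value>${nameNode}/user/cconner/teragen_oozie/input</value>
      </property>
      <property>
        <name>mapred.output.dir</name>
        <value>${nameNode}/user/cconner/teragen_oozie/output</value>
      </property>
      <property>
        <name>mapred.mapper.class</name>
        <value>org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper</value>
      </property>
      <property>
        <name>terasort.num-rows</name>
        <value>500</value>
      </property>
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="kill"/>
</action>
```

This is the answer to the original question: for a plain MapReduce action, the "classname", "input", and "output" all go into Job Properties rather than into an arguments field.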



BE AWARE that if you get a ClassNotFoundException for the jar's class, you are probably hitting HUE-1680.

"Current workaround: don't use a full path for the jar file, and don't put the 'Jar name' into lib; just put it one level up."

This is fixed in CDH 5.


Version history: revision 3 of 3, last updated 08-26-2015 02:31 PM