Posts: 48
Registered: ‎08-05-2015
Accepted Solution

Oozie - Local or HDFS?

When I use Oozie, I tend to submit coordinators and workflows through Hue because it's simple. As a result, I keep my properties files on HDFS, in the same folder as the corresponding workflow.xml or coordinator.xml. I have read in books and other reference material that the properties file should be on the local file system. This makes sense if I submit through the command line like so:

oozie job -run -config


So, knowing that I and others prefer working through Hue for simplicity, which is the better practice, or is it purely a matter of preference? (Note that my workflow.xml and coordinator.xml files are on HDFS no matter what.)


(Option 1) Ignore the reference material and keep the properties file in HDFS so that I can continue submitting and managing through Hue.

(Option 2) Keep the properties file on the local file system and submit through the command line.


Appreciate the dialog!

Posts: 1,613
Kudos: 303
Solutions: 245
Registered: ‎07-31-2013

Re: Oozie - Local or HDFS

The point about having the properties file on the local filesystem applies
only within the context of using the Oozie CLI. It is mentioned explicitly to
avoid confusion when writing a workflow XML: the XML needs to be on HDFS, but
the properties are a local file read by the Oozie CLI when invoked. People
often get confused by this relationship.

When you use Hue, which in turn simply uses the Oozie REST API, you can use
whatever mechanism Hue offers you to manage your properties (as an HDFS
file, defined within the workflow, etc.).

The properties are used to resolve the workflow's variables and to supply
it with some necessary parameters. Once the job is submitted, the properties
file or list of properties does not matter anymore to Oozie.
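
To make the CLI case concrete, here is a minimal sketch of the relationship described above: the properties file lives on the local filesystem, while the workflow XML lives on HDFS. The host names, ports, and the `apps/my-workflow` path are hypothetical placeholders, not values from this thread, and the script only prints the cluster commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -e

# Hypothetical values; replace them with your own cluster's settings.
OOZIE_URL="http://oozie.example.com:11000/oozie"
HDFS_USER="${USER:-$(id -un)}"
APP_DIR="/user/${HDFS_USER}/apps/my-workflow"

# Print commands by default so the sketch can be read without a cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. The properties file is a LOCAL file read by the Oozie CLI.
#    The quoted heredoc keeps the ${...} variables literal for Oozie to resolve.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/my-workflow
EOF

# 2. The workflow definition must be on HDFS, at the application path.
run hdfs dfs -mkdir -p "$APP_DIR"
run hdfs dfs -put -f workflow.xml "$APP_DIR/"

# 3. Submit: the CLI reads the local properties and sends them with the job.
run oozie job -oozie "$OOZIE_URL" -config job.properties -run
```

After step 3 the local job.properties is no longer consulted, which matches the point above that the properties do not matter to Oozie once the job is submitted.
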
Posts: 48
Registered: ‎08-05-2015

Re: Oozie - Local or HDFS

Thank you for the explanation, Harsh J. You hit on points that I'm sure others will find valuable along the way as well.


From your response, I gather I have options and it comes down to preference, which is what I was hoping for. :)

Posts: 10
Registered: ‎04-05-2017

Re: Oozie - Local or HDFS

Your answer is very confusing.  


Is there a single example of using a file to provide the --username and --password information needed by a jdbc:sqlserver:..... driver?  


Do I need to add arguments? I can add arguments for the entire Sqoop action command in the Oozie workflow editor in Hue, and it works with HARDCODED username and password values, but I don't want hardcoded values.


How about a screenshot showing the Sqoop Command arguments and a separate screenshot showing the Properties page?  Does anyone at Cloudera have one of these that they can share with the community?  It would make the trial and error, brute force method of trying all possible combinations of everything much easier.