Oozie job.properties - Local or HDFS?

Expert Contributor

When I'm using Oozie, I tend to use Hue to submit coordinators and workflows because it's simple.  As a result, I keep my job.properties files on HDFS, in the same folder as the corresponding workflow.xml or coordinator.xml.  I've read in books and other reference material that the job.properties file should be on the local file system.  This makes sense if I submit through the command line like so:

oozie job -run -config job.properties
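
For reference, the file being read there is a plain Java-style properties file on the local disk. A minimal sketch for a coordinator submission might look like this (the host names and paths are placeholders, not values from this thread):

    nameNode=hdfs://namenode-host:8020
    jobTracker=resourcemanager-host:8032
    queueName=default
    oozie.coord.application.path=${nameNode}/user/me/apps/my-coordinator

(For a plain workflow, oozie.wf.application.path would point at the workflow directory instead.)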

 

So, knowing that I and others prefer interfacing through Hue for simplicity, which is the better practice, or is it completely preferential? (Note that my workflow.xml and coordinator.xml files are on HDFS no matter what.)

(Option 1) Ignore the reference material in favor of keeping job.properties on HDFS, so that I can continue submitting and managing through Hue.

(Option 2) Keep the job.properties file on the local file system and submit through the command line.

Appreciate the dialog!

1 ACCEPTED SOLUTION

Mentor
The point about having the job.properties on the local filesystem applies only in the context of using the Oozie CLI - it is mentioned explicitly to avoid confusion in the process of writing a workflow XML (the XML needs to be on HDFS, but the properties are a local file read by the Oozie CLI when invoked - people often get confused by this relationship).

When you use Hue, which in turn simply uses the Oozie REST API, you can use whatever mechanism Hue offers you to manage your job.properties (as an HDFS file, defined within the workflow, etc.).

The properties are used to resolve the workflow's variables and to supply it some necessary parameters. Once the job is submitted, the properties file or list of properties no longer matters to Oozie.
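
To illustrate that relationship, a submission over the Oozie Web Services API sends the properties inline as a Hadoop-style configuration XML in the request body - a sketch, not Hue's exact call, with the host and file names as placeholders:

    curl -X POST -H "Content-Type: application/xml;charset=UTF-8" \
         -d @job-config.xml \
         "http://oozie-host:11000/oozie/v1/jobs?action=start"

where job-config.xml carries the same key=value pairs a job.properties file would, e.g.:

    <configuration>
      <property><name>user.name</name><value>someuser</value></property>
      <property><name>oozie.wf.application.path</name>
        <value>hdfs://namenode-host:8020/user/someuser/apps/my-workflow</value></property>
    </configuration>

Once that request is made, no properties file exists anywhere as far as Oozie is concerned.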


9 REPLIES


Expert Contributor

Thank you for the explanation, Harsh J.  You hit on points that I'm sure others will find valuable along the way as well.

From your response, I gather I have options and it comes down to preference, which is what I was hoping for. 🙂

Contributor

Your answer is very confusing.

Is there a single example of using a job.properties file to provide the --username and --password information needed by a jdbc:sqlserver:..... driver?

Do I need to add Arguments?  I can add arguments for the entire Sqoop action command within the Hue Oozie workflow editor, and it works with HARDCODED username and password values, but I don't want hardcoded values.
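
For what it's worth, the usual pattern is to reference ${...} variables in the Sqoop action and define them in job.properties - a sketch only, where db_connect, db_user, db_password and the other names are placeholders, not values from this thread:

    <action name="sqoop-import">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>${db_connect}</arg>
            <arg>--username</arg>
            <arg>${db_user}</arg>
            <arg>--password</arg>
            <arg>${db_password}</arg>
            <arg>--table</arg>
            <arg>${db_table}</arg>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>

with the matching entries in job.properties:

    db_connect=jdbc:sqlserver://dbhost:1433;databaseName=mydb
    db_user=myuser
    db_password=mypassword

(Note the password still sits in plain text in the properties file; this moves the problem out of the workflow rather than fully solving it.)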

 

How about a screenshot showing the Sqoop Command arguments and a separate screenshot showing the Properties page?  Does anyone at Cloudera have one of these that they can share with the community?  It would make the trial-and-error, brute-force method of trying all possible combinations of everything much easier.

Thanks.

Expert Contributor
You would probably be better served by asking this question in a separate thread.

New Contributor

Hi Harsh J,

I'm trying to update the job.properties file for my coordinator after making a few changes to it; the file is in HDFS. How do I make sure my new job.properties file is read by the coordinator the next time it executes?

Do I need to do anything at the CLI?

Condition: my job.properties file should be in the HDFS location where the workflow.xml and coordinator.xml files are present.

Thanks
Venkat

Expert Contributor

From what I recall, the job.properties is read only at initial submission, so any subsequent changes will not take effect.  I believe killing and re-submitting the job is what you'll need to do.  This is in contrast to changing something like the workflow.xml, which is read anew on every execution.
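
From the CLI, that would look roughly like the following - the Oozie URL and the coordinator job ID here are placeholders:

    oozie job -oozie http://oozie-host:11000/oozie -kill 0000123-170101000000000-oozie-oozi-C
    oozie job -oozie http://oozie-host:11000/oozie -run -config job.properties

The -config file is read from the local directory where you run the command, even though the application path it points at is on HDFS.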

New Contributor
Thanks, tseader. If we kill and re-submit the coordinator, we will lose the history/lineage of that job, correct?
Is there any way to update job.properties without killing and re-submitting?
Thanks
Venkat

New Contributor

Your job.properties serves future launches and is very handy when any of your cluster parameters change due to upgrades or other issues. It is not meant only for launching; it can be used for testing as well if you configure your files appropriately.

New Contributor

It is good practice to keep the job.properties file local, and also to parameterize as many of the variables in your HDFS files as possible within job.properties, so that when you need to make changes there is no need to "tamper" with the files on HDFS: you just edit your job.properties and restart the Oozie workflow. The tradition of keeping job.properties in HDFS usually bites you when you need to restart your jobs: restarting with an unedited job.properties means the Oozie workflow begins from the day you originally set it up, which could be months or years in the past.
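
As a sketch of that approach (the dates, paths, and names are illustrative, not from this thread), the coordinator's materialization window can live in the local job.properties:

    nameNode=hdfs://namenode-host:8020
    jobTracker=resourcemanager-host:8032
    oozie.coord.application.path=${nameNode}/user/me/apps/my-coordinator
    start=2017-06-01T00:00Z
    end=2018-06-01T00:00Z

with the coordinator.xml on HDFS referencing those values instead of hardcoding them:

    <coordinator-app name="my-coord" frequency="${coord:days(1)}"
                     start="${start}" end="${end}" timezone="UTC"
                     xmlns="uri:oozie:coordinator:0.4">
      ...
    </coordinator-app>

On a restart you then only touch the local start= value, never the XML on HDFS.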