When I'm using Oozie, I tend to use Hue to submit coordinators and workflows because it's simple. As a result, I keep my job.properties files on HDFS, in the same folder as the corresponding workflow.xml or coordinator.xml. I read in books and other reference material that the job.properties file should be on the local file system. This makes sense if I submit from the command line like so:
oozie job -run -config job.properties
So, knowing that I and others prefer working through Hue for simplicity, which is the better practice, or is it purely a matter of preference? (Note that my workflow.xml and coordinator.xml files are on HDFS no matter what.)
(Option 1) Ignore the reference material and keep the job.properties file on HDFS so that I can continue submitting and managing through Hue.
(Option 2) Keep the job.properties file on the local file system and submit from the command line.
Appreciate the dialog!
Thank you for the explanation Harsh J. You hit on points that I'm sure others will find valuable along the way as well.
From your response, I gather I have options and it comes down to preference, which is what I was hoping for. :)
Your answer is very confusing.
Is there a single example of using a job.properties file to provide the --username and --password information needed by a jdbc:sqlserver:..... driver?
Do I need to add Arguments? I can add arguments for the entire Sqoop action command within my Hue Oozie workflow editor, and it works with HARDCODED username and password values, but I don't want hardcoded values.
How about a screenshot showing the Sqoop Command arguments and a separate screenshot showing the Properties page? Does anyone at Cloudera have one of these that they can share with the community? It would make the trial and error, brute force method of trying all possible combinations of everything much easier.
Hi Harsh J,
I'm trying to update the job.properties file for my coordinator after making a few changes to it. The file is on HDFS. How do I make sure the coordinator reads the new job.properties the next time it executes?
Do I need to do anything at the CLI?
Condition: my job.properties file has to stay in the HDFS location where the workflow.xml and coordinator.xml files are.
From what I recall, the job.properties is read in only on initial submission, so any subsequent changes will not take effect. I believe killing and re-submitting the job is what you'll need to do. This is in contrast to changing something like a workflow.xml, which is read in anew during every execution.
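For anyone who lands here later, this is a rough sketch of that kill-and-resubmit cycle when the job.properties lives on HDFS. The function name and the OOZIE_BIN and HDFS_BIN variables are my own inventions for illustration, the coordinator ID and HDFS path in the usage comment are placeholders, and I'm assuming OOZIE_URL is already exported so that -oozie does not have to be passed.

```shell
#!/bin/sh
# Sketch only: kill a coordinator and resubmit it so that an edited
# job.properties is actually picked up.
# Assumes OOZIE_URL is exported; OOZIE_BIN / HDFS_BIN are hypothetical
# override hooks, not anything Oozie or Hadoop defines.
OOZIE_BIN="${OOZIE_BIN:-oozie}"
HDFS_BIN="${HDFS_BIN:-hdfs}"

resubmit_coordinator() {
    job_id="$1"                          # ID of the running coordinator
    hdfs_props="$2"                      # job.properties path on HDFS
    local_props="${3:-./job.properties}" # where to put the local copy

    # The CLI's -config option reads a local file, so copy it down first.
    "$HDFS_BIN" dfs -cat "$hdfs_props" > "$local_props" || return 1

    # Stop the old coordinator, which still holds the old properties.
    "$OOZIE_BIN" job -kill "$job_id" || return 1

    # Submit and start a new job with the fresh properties.
    "$OOZIE_BIN" job -config "$local_props" -run
}

# Usage (placeholder ID and path):
#   resubmit_coordinator 0000001-000000000000000-oozie-oozi-C \
#       /user/me/apps/my-coordinator/job.properties
```

The master copy still stays on HDFS next to coordinator.xml, so Hue users are not affected. The local copy only exists because the CLI wants a local file.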
Your job.properties serves future launches and is very handy when any of your cluster parameters change due to upgrades or other issues. It is not meant only for launching; it can also be used for testing if you configure your files appropriately.
It is good practice to keep the job.properties file on the local file system and to parameterize as many variables in your HDFS files as possible through job.properties. That way, when something needs to change, there is no need to "tamper" with the files on HDFS; you just edit job.properties and restart the Oozie workflow. Keeping job.properties on HDFS will usually bite you when you need to restart your jobs: restarting with an unedited job.properties means the Oozie workflow begins from the start date you originally configured, and that could be months or years in the past.
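As an illustration of that kind of parameterization, here is a minimal sketch that writes a local job.properties. Every host name, path and date in it is a placeholder, and startTime, endTime and appDir are property names I made up; it assumes your coordinator.xml refers to them as ${startTime}, ${endTime} and so on.

```shell
#!/bin/sh
# Sketch only: keep cluster-specific values in a local job.properties so the
# XML files on HDFS never need editing. All values below are placeholders.
PROPS_FILE="${PROPS_FILE:-./job.properties}"

# The quoted 'EOF' stops the shell from expanding ${...}; those references
# are left in the file for Oozie to resolve.
cat > "$PROPS_FILE" <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
appDir=${nameNode}/user/${user.name}/apps/my-coordinator
oozie.coord.application.path=${appDir}
oozie.use.system.libpath=true
# Move startTime forward before resubmitting, otherwise the coordinator
# starts again from the original date.
startTime=2024-01-01T00:00Z
endTime=2024-12-31T00:00Z
EOF

echo "Submit with: oozie job -config $PROPS_FILE -run"
```

When the cluster changes (a new NameNode address after an upgrade, say), only this one file needs to be edited before resubmitting.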