Oozie job.properties - Local or HDFS?
Labels: Apache Oozie
Created on ‎03-29-2016 08:28 AM - edited ‎09-16-2022 03:11 AM
When I'm using Oozie, I tend to use Hue to submit coordinators and workflows because it's simple. As a result, I keep my job.properties files on HDFS, in the same folder as the corresponding workflow.xml or coordinator.xml. I have read in books and other reference material that the job.properties file should be on the local file system. This makes sense if I submit through the command line like so:
oozie job -run -config job.properties
So, knowing that I and others prefer working through Hue for simplicity, which is the better practice, or is it purely a matter of preference? (Note that my workflow.xml and coordinator.xml files are on HDFS no matter what.)
(Option 1) I ignore the reference material in favor of keeping the job.properties in HDFS so that I can continue submitting and managing through Hue
(Option 2) Keep the job.properties file on the local file system and issue through the command-line.
Appreciate the dialog!
Created ‎03-29-2016 08:37 AM
That advice applies within the context of using the Oozie CLI. It is stated
explicitly to avoid confusion when writing a workflow XML: the XML needs to
be on HDFS, but the properties are a local file read by the Oozie CLI when
it is invoked. People often get confused by this relationship.
When you use Hue, which in turn simply uses the Oozie REST API, you can use
whatever mechanism Hue offers you to manage your job.properties (as an HDFS
file, defined within the workflow, etc.).
The properties are used to resolve the workflow's variables and to supply
it with some necessary parameters. Once the job is submitted, the properties
file or list of properties no longer matters to Oozie.
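To make the relationship above concrete, here is a minimal Python sketch of what a submitting client roughly does with job.properties: read the key=value pairs and hand them to the Oozie server as an XML configuration document. This is an illustration, not the Oozie CLI's or Hue's actual code. The parser is deliberately simplified (no line continuations or ":" separators), and the host names, paths, and the `SAMPLE` content are placeholders I made up.

```python
from xml.etree import ElementTree as ET


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def to_oozie_configuration(props):
    """Build a <configuration> XML document from the parsed properties."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical job.properties content; host names and paths are placeholders.
SAMPLE = """
# cluster endpoints
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
oozie.wf.application.path=${nameNode}/user/someuser/apps/demo
"""

config_xml = to_oozie_configuration(parse_properties(SAMPLE))
print(config_xml)
```

Where the key=value pairs come from (a local file, an HDFS file, or fields in the Hue editor) makes no difference to the result, which is why the location of job.properties only matters to the tool doing the submission.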
Created ‎03-29-2016 09:09 AM
Thank you for the explanation Harsh J. You hit on points that I'm sure others will find valuable along the way as well.
From your response, I gather I have options and it comes down to preference, which is what I was hoping for. 🙂
Created ‎08-02-2017 01:13 PM
Your answer is very confusing.
Is there a single example of using a job.properties file to provide the --username and --password information needed by a jdbc:sqlserver:..... driver?
Do I need to add arguments? I can add arguments for the entire Sqoop action command within my Hue Oozie workflow editor, and it works with HARDCODED username and password values, but I don't want hardcoded values.
How about a screenshot showing the Sqoop Command arguments and a separate screenshot showing the Properties page? Does anyone at Cloudera have one of these that they can share with the community? It would make the trial and error, brute force method of trying all possible combinations of everything much easier.
Thanks.
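For readers with the same question, the general idea is that a workflow refers to a value as ${name} and the value itself comes from the submitted properties. The Python sketch below only mimics that substitution so the mechanics are visible; it is not Oozie's own resolver and not a confirmed Hue recipe. The property names `dbUrl`, `dbUser`, and `dbPassword`, the table name, and all values are hypothetical.

```python
import re

# Matches ${name} placeholders such as ${dbUser}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_.]*)\}")


def resolve(template, props):
    """Replace each ${name} with props[name]; raise KeyError if it is missing."""
    return PLACEHOLDER.sub(lambda m: props[m.group(1)], template)


# Hypothetical property names; the values would come from job.properties.
props = {
    "dbUrl": "jdbc:sqlserver://dbhost.example.com:1433;databaseName=sales",
    "dbUser": "etl_user",
    "dbPassword": "not-a-real-password",
}

# Hypothetical Sqoop action arguments, one list entry per argument.
args = [
    "import",
    "--connect", "${dbUrl}",
    "--username", "${dbUser}",
    "--password", "${dbPassword}",
    "--table", "orders",
]

resolved = [resolve(a, props) for a in args]
print(resolved)
```

Under these assumptions, the arguments in the workflow would hold only the ${...} references, and the matching dbUrl=, dbUser=, and dbPassword= lines would live in job.properties.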
Created ‎07-19-2018 10:32 AM
Hi Harsh J,
I'm trying to update the job.properties file for my coordinator after making a few changes to it. The file is in HDFS. How do I make sure the new job.properties file is read by the coordinator the next time it executes?
Do I need to do anything at the CLI?
Condition: My job.properties file should be in the HDFS location where the workflow.xml and coordinator.xml files are present.
Thanks
Venkat
Created ‎07-19-2018 10:59 AM
From what I recall, the job.properties is read in only on initial submission, so any subsequent changes will not take effect. I believe killing and re-submitting the job is what you'll need to do. This is in contrast to changing something like a workflow.xml, which is read in anew during every execution.
Created ‎07-19-2018 11:22 AM
Is there any way to update job.properties without killing or re-submitting the job?
Thanks
Venkat
Created ‎03-18-2019 01:26 PM
Your job.properties serves future launches and is very handy when any of your cluster parameters change due to upgrades or other issues. It is not meant only for launching; it can also be used for testing if you configure your files appropriately.
Created ‎03-18-2019 01:22 PM
It is good practice to keep the job.properties file on the local file system, and to parameterize as many variables from your HDFS files as possible in job.properties. That way, when something needs to change, there is no need to "tamper" with the files on HDFS: you just edit job.properties and restart the Oozie workflow. Keeping job.properties in HDFS will usually bite you when you need to restart your jobs. Restarting with an unedited job.properties means the Oozie workflow begins from the date you originally set it up, which could be months or years in the past.
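The stale start date problem described above can be sketched in Python as a small step run before re-submitting: rewrite one key in the properties text with a fresh UTC timestamp. This is only an illustration under assumptions. The property name `start` is hypothetical and presumes your coordinator.xml reads its start time from ${start}; the sample content and dates are placeholders.

```python
from datetime import datetime, timezone


def set_property(text, key, value):
    """Return properties text with key set to value, appending it if absent."""
    out, found = [], False
    for line in text.splitlines():
        name = line.partition("=")[0].strip()
        if name == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


def oozie_timestamp(moment):
    """Format a timezone-aware datetime as UTC, e.g. 2016-03-29T08:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")


# Hypothetical old file content with a start date from the first submission.
old = "nameNode=hdfs://namenode.example.com:8020\nstart=2016-03-29T08:00Z\n"

restart_at = datetime(2019, 3, 18, 13, 0, tzinfo=timezone.utc)
new = set_property(old, "start", oozie_timestamp(restart_at))
print(new)
```

In real use you would pass the current time instead of the fixed `restart_at`, write the result back to job.properties, and then re-submit the job.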
