Externalize properties for Oozie workflows

Contributor

Hi,

We are developing dozens of different workflows in Oozie. We use three environments: DEV, PRE and PRO.

 

What I'd like to do is keep the same workflow.xml files in all three environments and read only the environment-specific properties (DB connections, URLs, authentication, etc.) from the Cloudera environment. That way, the files could be version-controlled and deployed easily. Is this possible?

 

Googling, I saw that config-default.xml may be the place for default property values, but I cannot find one in our Cloudera 5.10. We use Hue to design the workflows, although we could define them another way.

 

Thanks!

1 ACCEPTED SOLUTION

Contributor

I finally solved this issue by using a config-default.xml located in the workspace of every workflow; I put all the variables in there.

When I need to update the values, I run a script that updates all the workspace directories.


8 REPLIES

Super Guru
Hi,

I think it should be doable, though not through Hue.

You can try putting the environment-specific variables into a job.properties file and passing them into the workflow, so that the workflow stays generic and can be used in other environments.

Hue will hard-code things like credentials into the workflow if you are in a secured environment, so it won't work there. You might have to create those workflows manually and submit the jobs from the command line.
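
A minimal sketch of the job.properties approach, for illustration only. The host names, HDFS paths, and the jdbcUrl and dbUser property names are hypothetical; nameNode, jobTracker and oozie.wf.application.path are the property names used in the standard Oozie examples. The script only writes one job.properties per environment and does not contact a cluster.

```shell
#!/bin/sh
# Sketch: generate one job.properties per environment so that workflow.xml
# can stay identical everywhere. All hosts, paths and the jdbcUrl/dbUser
# property names below are hypothetical placeholders.

OUT_DIR=./envs

write_job_properties() {
    env_name="$1"   # dev, pre or pro
    mkdir -p "$OUT_DIR/$env_name"
    # \${...} is escaped so that Oozie, not the shell, resolves it later.
    cat > "$OUT_DIR/$env_name/job.properties" <<EOF
nameNode=hdfs://${env_name}-namenode.example.com:8020
jobTracker=${env_name}-resourcemanager.example.com:8032
oozie.wf.application.path=\${nameNode}/user/\${user.name}/workflows/my-workflow
jdbcUrl=jdbc:postgresql://${env_name}-db.example.com:5432/datalake
dbUser=etl_${env_name}
EOF
}

for env_name in dev pre pro; do
    write_job_properties "$env_name"
done

# Submitting from the command line would then look like this (not run here):
#   oozie job -oozie http://<oozie-host>:11000/oozie -config envs/pre/job.properties -run
```

Inside workflow.xml the values would be referenced as ${jdbcUrl} and ${dbUser}, so only the properties file differs between DEV, PRE and PRO.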

Contributor

Hi Eric,

Thanks for the reply. I think I saw the command-line example you refer to; is it this one? https://oozie.apache.org/docs/4.0.0/DG_Examples.html

The thing is that our client wants a graphical interface like Hue, so that they can control (and launch) any job in the data lake.

Maybe we could edit job.properties externally in the WF's workspace? Actually, I've tried editing the contents of the file, but Hue ignores the new values: when submitting, a popup opens showing all the properties with their values unchanged. Maybe I'm doing something wrong?

Super Guru
You can't modify workflow settings outside of Hue, including job.properties and workflow.xml, as Hue will regenerate them when you submit the workflow again.

If you want to stick with Hue, I do not think it will work. But let me check whether Hue has such a feature on the roadmap.

Contributor

Actually, I edited job.properties with Hue's file browser. But the original properties must be stored somewhere else (in Hue's memory?) because they remain unchanged. I even restarted Oozie and Hue and reopened the WF, but Hue still did not take the new values in job.properties into account.

Thanks.

Contributor

I finally solved this issue by using a config-default.xml located in the workspace of every workflow; I put all the variables in there.

When I need to update the values, I run a script that updates all the workspace directories.
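
For illustration, a config-default.xml of this kind could be produced as sketched here. The file uses the usual Hadoop-style configuration layout that Oozie reads; the property names (jdbcUrl, dbUser) and their values are hypothetical placeholders, not the ones used in the thread.

```shell
#!/bin/sh
# Sketch: write a config-default.xml holding the environment-specific values.
# jdbcUrl and dbUser are hypothetical names; use whatever your workflows expect.
cat > /tmp/config-default.xml <<'EOF'
<configuration>
    <property>
        <name>jdbcUrl</name>
        <value>jdbc:postgresql://pre-db.example.com:5432/datalake</value>
    </property>
    <property>
        <name>dbUser</name>
        <value>etl_pre</value>
    </property>
</configuration>
EOF
```

A workflow.xml sitting next to this file can then refer to ${jdbcUrl} and ${dbUser}. Values passed at submission time take precedence over these defaults, so the file only supplies what the submitter leaves unset.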

New Contributor

Hi,

 

How are you making config-default.xml available in the workspace of every workflow? For example, I'm creating a new Oozie workflow from Hue. I know job.properties and workflow.xml are (re)created only when the job is submitted or re-run. How are you creating or copying config-default.xml for all the new workflows that are being created?

 

Thanks

Contributor

Hi Nnr, I use this command to copy the file located at the local path /tmp/config-default.xml into every workspace, overwriting any existing copy:

 

hdfs dfs -ls -C /user/hue/oozie/workspaces/ | grep hue-oozie- | xargs -I % sh -c 'hdfs dfs -put -f /tmp/config-default.xml %' 

 

Use at your own risk 😉
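
A slightly more cautious variant of the same idea, as a sketch only: the deploy_config function and the DRY_RUN variable are hypothetical names introduced here, not part of HDFS or Hue. It reads workspace paths from standard input and, by default, only prints the hdfs commands it would run.

```shell
#!/bin/sh
# Sketch: copy a local config-default.xml into every workspace path read
# from stdin. With DRY_RUN=1 (the default) the commands are only printed.
deploy_config() {
    src="$1"
    while IFS= read -r workspace; do
        # Skip empty lines in the listing.
        [ -n "$workspace" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "hdfs dfs -put -f $src $workspace"
        else
            # </dev/null keeps hdfs from consuming the list of paths on stdin.
            hdfs dfs -put -f "$src" "$workspace" </dev/null
        fi
    done
}

# Against a real cluster, the listing from the command above would be piped in
# (not run here):
#   hdfs dfs -ls -C /user/hue/oozie/workspaces/ | grep hue-oozie- \
#       | DRY_RUN=0 deploy_config /tmp/config-default.xml
```

Running it once in dry-run mode shows exactly which workspaces would be overwritten before anything is changed.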

 

It is a pity that HDFS does not implement symlinks; that would be much more maintainable.

 

Regards

New Contributor

Hi elkarel,

 

Thanks for your quick reply. The script is useful for me as well. But it's a pity that we don't have an option to copy the default files to the workspace when the workspace is created.

 

We should have a configuration option in hue.ini (like remote_data_dir) to copy the default contents once the workspace directory is created.

 

Thanks