Created 03-17-2016 06:27 AM
I am new to Oozie and trying to understand the significance of the schema URI in an Oozie workflow. Below is a typical schema URI declaration.
xmlns="uri:oozie:workflow:0.5"
What exactly does it mean? What does the number (0.5) at the end signify? Where should I get the URI information from? I am using CDH 5.3.6 and the Oozie client build version is 4.0.0. Please write a detailed explanation of the Oozie schema if possible.
I would much appreciate your help. Thank you.
Created 03-17-2016 06:46 PM
An XML document can have an XSD (XML Schema Definition) that defines its schema/structure. It describes which elements are allowed, how they nest, how many times they may occur, and so on. When writing XML that conforms to a specific XSD, you specify a namespace to point to it.
For Oozie there are version-specific XSDs, which differ in the features/elements/attributes/actions they support. Here's the workflow 0.5 XSD that this xmlns refers to: https://github.com/apache/oozie/blob/master/client/src/main/resources/oozie-workflow-0.5.xsd
When you refer to that XSD, it means your workflow XML will adhere to it and use only the elements/attributes it describes.
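To make this concrete, here is a minimal sketch of where the namespace goes in a workflow definition. The workflow name, node names, directory path, and the ${nameNode} parameter are placeholders made up for illustration; only the xmlns value is taken from the question.

```xml
<!-- Hypothetical workflow; all names and paths are placeholders. -->
<!-- The xmlns on the root element selects the workflow schema version (0.5 here). -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="example-wf">
    <start to="make-dir"/>
    <!-- Child elements inherit the default namespace, so they are all
         checked against the 0.5 workflow XSD. -->
    <action name="make-dir">
        <fs>
            <mkdir path="${nameNode}/tmp/example-dir"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Changing the trailing number in the xmlns value changes which XSD version the same file is validated against.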
Created 03-17-2016 11:23 AM
My understanding is that you reference a particular version of the Oozie XML language for workflows. The number 0.5 signifies the latest version as of Oozie 4.2.0. With every new version, Oozie actions are added, deprecated, removed, and extended. Take the fs action, for example; here are notes on it from the docs:
As of schema 0.4, if a name-node element is specified, then it is not necessary for any of the paths to start with the file system URI as it is taken from the name-node element. This is also true if the name-node is specified in the global section (see Global Configurations).
As of schema 0.4, zero or more job-xml elements can be specified; these must refer to Hadoop JobConf job.xml formatted files bundled in the workflow application. They can be used to set additional properties for the FileSystem instance.
As of schema 0.4, if a configuration element is specified, then it will also be used to set additional JobConf properties for the FileSystem instance. Properties specified in the configuration element override properties specified in the files specified by any job-xml elements.
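The three notes above can be sketched as a single fs action. This is only an illustrative fragment, assuming the enclosing workflow-app declares schema 0.4 or later; the action and node names, fs-info.xml, the property, and the path are placeholders.

```xml
<!-- Fragment of a workflow declared with xmlns="uri:oozie:workflow:0.4" or later. -->
<action name="cleanup">
    <fs>
        <!-- With name-node present, the paths below may omit the file system URI. -->
        <name-node>${nameNode}</name-node>
        <!-- Optional: extra properties for the FileSystem instance, read from a
             job.xml-formatted file bundled with the workflow application. -->
        <job-xml>fs-info.xml</job-xml>
        <!-- Optional: inline properties; these override values from job-xml files. -->
        <configuration>
            <property>
                <name>some.property</name>
                <value>some.value</value>
            </property>
        </configuration>
        <delete path="/tmp/example-dir"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

Under an older schema URI, the name-node, job-xml, and configuration children are not described for the fs action, which is the kind of difference the version number in the URI tracks.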
I would not pay too much attention to the schema URI; just try using the latest.
Created 03-21-2016 08:42 AM
@Artem Ervits, thanks for your answer.
Created 03-21-2016 08:41 AM
Thanks a mil @rpatil. I understand its significance now. However, one question is still unanswered and I hope you can answer it: where should I get the URI information from? And a new question: how do I determine which version to use?
Created 03-21-2016 08:48 AM
Hi @Alex Raj. Thanks. I feel I have answered that as well, but let me explain better.
Created 03-21-2016 11:36 AM
Thanks again. Your answers make me dig deeper and ask more questions. Suppose there is a new release of Oozie, say 5.0.2. How will I determine which version to use in the schema URI? Should I look at the same URL that you mentioned, or is there a better way?
Created 03-21-2016 11:44 AM
As far as I understand, Oozie has so far been careful to keep backward compatibility. That is, existing workflows with existing URIs work fine in a newer release without any changes, so you can keep using the older URI. Only if you know that a new feature (tag/attribute) has been added in the new version and you want to use it do you need to refer to the new URI.
Created 03-21-2016 09:49 AM
Just a reminder: you will surely get more useful help here for HDP than for CDH. If a specific detail is not answered, look for a Cloudera forum; they could be more specific.