Oozie shell action: exec and file tags
Labels: Apache Oozie
Created ‎01-27-2016 07:34 AM
I'm new to Oozie. I've read several Oozie shell action examples, but they left me confused about a few things. Some examples have no <file> tag at all.
Others, like this one from Cloudera, repeat the shell script name in the <file> tag:
<shell xmlns="uri:oozie:shell-action:0.2">
    <exec>check-hour.sh</exec>
    <argument>${earthquakeMinThreshold}</argument>
    <file>check-hour.sh</file>
</shell>
The example on Oozie's website writes the script name twice, separated by #. It uses ${EXEC}, a reference defined in job.properties that points to the script.sh file:
<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>${EXEC}</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>${EXEC}#${EXEC}</file>
</shell>
I've also seen examples where a path (HDFS or local?) is prepended to `script.sh#script.sh` inside the <file> tag:
<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>script.sh</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>/path/script.sh#script.sh</file>
</shell>
As I understand it, any shell script can be placed in the workflow's HDFS path (the same path where workflow.xml resides). Can someone explain the differences between these examples and how `<exec>`, `<file>`, `script.sh#script.sh`, and `/path/script.sh#script.sh` are used?
Created ‎01-27-2016 09:55 AM
OK, the exec tag runs a shell script from the local working directory of the Oozie action.
For example /hadoop/yarn/.../oozietmp/myscript.sh
You cannot know in advance which directory this is or which server it is on. It is some YARN temp dir.
The file tag is there to put something into this temp dir. You can also rename the file using the # syntax.
So suppose your shell script is in HDFS at hdfs:///tmp/myfolder/myNewScript.sh
but you do not want to change the exec tag for some reason.
You can do
<file>/tmp/myfolder/myNewScript.sh#myscript.sh</file>
and Oozie will take the file from HDFS, put it into the temp folder before execution, and rename it to myscript.sh.
You can use the file tag to ship any kind of file (jars or other dependencies, for example).
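Put together, a shell action using this renaming could look roughly like the sketch below. The ${jobTracker} and ${nameNode} properties are assumptions (they would have to be defined in your job.properties), and the argument value is only a placeholder.

```xml
<shell xmlns="uri:oozie:shell-action:0.2">
    <!-- assumed properties, expected to be defined in job.properties -->
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- the name the script has in the action's working directory -->
    <exec>myscript.sh</exec>
    <!-- placeholder argument passed to the script -->
    <argument>A</argument>
    <!-- HDFS path of the script, then # and the local name that exec refers to -->
    <file>/tmp/myfolder/myNewScript.sh#myscript.sh</file>
</shell>
```

The exec value and the part after # have to match, otherwise the script will not be found in the working directory.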
As far as I can see, ${EXEC} is just a variable they set somewhere; it has no special meaning to Oozie.
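To line this up with the three forms in the question: as far as I know, a relative path in the file tag is resolved against the workflow application folder in HDFS, an absolute path is used as given, and when there is no # part the file keeps its own name. A sketch, assuming EXEC=script.sh is set in job.properties (each file line is an alternative, you would pick only one):

```xml
<!-- 1. Relative path, no #: the script sits next to workflow.xml and keeps its name -->
<file>check-hour.sh</file>

<!-- 2. ${EXEC}#${EXEC} with EXEC=script.sh (assumed value) expands to this;
        the part after # only repeats the name the file would get anyway -->
<file>script.sh#script.sh</file>

<!-- 3. Absolute HDFS path; the part after # is the name in the working directory -->
<file>/path/script.sh#script.sh</file>
```

In all three cases the exec tag has to use the local name, meaning the part after # or the plain file name when there is no #.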
Last but not least, if you want to avoid the file tag, you can simply put these files into a lib folder inside the workflow folder. Oozie ships everything in that folder by default.
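A sketch of that variant, assuming a hypothetical script myscript.sh sits in the lib folder next to workflow.xml, and again assuming ${jobTracker} and ${nameNode} come from job.properties:

```xml
<!--
  Assumed layout of the workflow folder in HDFS:
    workflow.xml
    lib/myscript.sh
-->
<shell xmlns="uri:oozie:shell-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- the script is referenced by its plain file name -->
    <exec>myscript.sh</exec>
    <!-- no file tag here, the lib folder content is shipped with the action -->
</shell>
```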
