Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Oozie shell action: exec and file tags

Contributor

I'm new to Oozie. I've read several Oozie shell action examples, but they left me confused about a few things.

 

Some examples I've seen have no <file> tag at all.

Others, like this one from Cloudera, repeat the shell script name in the <file> tag:

<shell xmlns="uri:oozie:shell-action:0.2">
    <exec>check-hour.sh</exec>
    <argument>${earthquakeMinThreshold}</argument>
    <file>check-hour.sh</file>
</shell>

The example on Oozie's website, on the other hand, writes the shell script (referenced as `${EXEC}` from job.properties, which points to the script.sh file) twice, separated by #:

<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>${EXEC}</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>${EXEC}#${EXEC}</file>
</shell>

 

I've also seen examples where a path (HDFS or local?) is prepended to `script.sh#script.sh` within the <file> tag:

<shell xmlns="uri:oozie:shell-action:0.1">
    ...
    <exec>script.sh</exec>
    <argument>A</argument>
    <argument>B</argument>
    <file>/path/script.sh#script.sh</file>
</shell>

As I understand it, any shell script can be placed in the workflow's HDFS directory (the same directory where workflow.xml resides).

Can someone explain the differences between these examples and how `<exec>`, `<file>`, `script.sh#script.sh`, and `/path/script.sh#script.sh` are used?

1 ACCEPTED SOLUTION

Mentor
Let's say you want to execute "script.sh".

1. If script.sh is inside your workflow's lib/ directory (WF/lib/) on HDFS, you just need <exec>script.sh</exec>

2. If script.sh is at an arbitrary path on HDFS, you need:

<exec>script.sh</exec>
<file>/path/to/script.sh#script.sh</file>

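3. In case (1), the first form below is redundant. The second form is for when you want to invoke the script under a different name; the part after # is the link name that the script gets in the action's working directory, and that is the name <exec> must use: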

<exec>script.sh</exec>
<file>script.sh#script.sh</file>

<exec>linked-script-name.sh</exec>
<file>original-script-name.sh#linked-script-name.sh</file>
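
To put case (2) in context, here is a minimal sketch of a complete workflow.xml built around it. The workflow name, the action and node names, and the `${jobTracker}` and `${nameNode}` properties are placeholders assumed to be defined in job.properties; none of them come from the examples quoted above, and `/path/to/script.sh` stands in for wherever the script actually lives on HDFS.

```xml
<!-- Hypothetical workflow: all names and paths are illustrative placeholders. -->
<workflow-app name="shell-example-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="run-script"/>
    <action name="run-script">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <!-- Assumed to be set in job.properties -->
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Command to run, resolved in the action's working directory -->
            <exec>script.sh</exec>
            <argument>A</argument>
            <argument>B</argument>
            <!-- HDFS source path, then '#', then the link name that <exec> refers to -->
            <file>/path/to/script.sh#script.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The path in <file> is an HDFS path, not a local one. As far as I know, a relative path there is resolved against the workflow application directory, which would explain why a bare `<file>check-hour.sh</file>` works when the script sits next to workflow.xml.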
