How to pass table names from a file to oozie workflow

I have a workflow in Oozie, and I want to pass a table name to it as an argument. The table names are listed in a file, tables.txt. How can I pass the table names from tables.txt to the workflow?

<workflow-app name="Shell_test" xmlns="uri:oozie:workflow:0.5">
    <start to="test_shell"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="test_shell">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>shell.sh</exec>
            <argument>${table}</argument>
            <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
            <file>/user/oozie/lib/test_shell.sh#shell.sh</file>
            <file>/user/oozie/input/tables.txt#tables.txt</file>
        </shell>
        <ok to="End"/>
        <error to="email-error"/>
    </action>
    <action name="email-error">
        <email xmlns="uri:oozie:email-action:0.2">
            <to>xxxxxxxxxx.com</to>
            <subject>Status of workflow ${table}</subject>
            <body>The workflow ${table} ${wf:id()} had issues and was killed. The error message is: ${wf:errorMessage(wf:lastErrorNode())}</body>
            <content_type>text/plain</content_type>
        </email>
        <ok to="End"/>
        <error to="End"/>
    </action>
    <end name="End"/>
</workflow-app>

I was able to do this using the following in the workflow.

<argument>${input_file}</argument>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>/user/oozie/lib/test_shell.sh#shell.sh</file>
<file>/user/oozie/input/${input_file}#${input_file}</file>

Now I have a problem.

If the workflow fails for one of the tables in input_file, I do not get any email. I get an email only when it fails for the last table in input_file.

Why is this happening, and how can I get an email every time the workflow fails? Or am I approaching the whole process the wrong way?

Could anyone please explain and point out where I am going wrong?

1 REPLY

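Inside your shell script, check whether the processing of any Hive table fails and, if it does, have the script exit with a non-zero code. The action will then fail and transition to email-error. By default a shell script returns the exit status of the last command it runs, which is probably why only a failure on the last table currently triggers the email.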
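
A minimal sketch of what shell.sh could look like along those lines. It assumes the script receives the input file name as its first argument (as in your `<argument>${input_file}</argument>`) and that the file holds one table name per line. `process_table` and `run_tables` are hypothetical names, and the body of `process_table` is only a placeholder for whatever your script really does per table.

```shell
#!/bin/bash
# Sketch only: process_table and run_tables are hypothetical helper names.

# Placeholder for the real per-table work (for example a hive command).
# It must return non-zero when the table fails.
process_table() {
    echo "processing table: $1"
}

# Loop over every table in the file and remember whether any of them failed.
run_tables() {
    local input_file="$1"
    local failed=0
    local table

    if [ ! -r "$input_file" ]; then
        echo "cannot read input file: $input_file" >&2
        return 2
    fi

    # Read the list on file descriptor 3 so that commands inside the loop
    # cannot accidentally consume the remaining table names from stdin.
    while IFS= read -r -u 3 table || [ -n "$table" ]; do
        [ -z "$table" ] && continue          # skip blank lines
        if ! process_table "$table"; then
            echo "table failed: $table" >&2  # goes to the action's stderr log
            failed=1
        fi
    done 3< "$input_file"

    return "$failed"
}

# Run only when an input file was passed, as the workflow's <argument> does.
if [ "$#" -ge 1 ]; then
    run_tables "$1"
    exit $?   # non-zero makes the shell action fail and go to email-error
fi
```

With this shape the remaining tables are still processed after a failure, and the action fails once at the end, so you would get one email per workflow run rather than one per table. If you need a separate email for each failed table, the script would have to stop at the first failure (`exit 1` inside the loop) or the workflow would have to be run once per table.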