How to pass table names from a file to an Oozie workflow


I have a workflow in Oozie, and I want to pass a table name to it as an argument. The table names are listed in a file, tables.txt, and I want to pass them from tables.txt to the workflow.

<workflow-app name="Shell_test" xmlns="uri:oozie:workflow:0.5">
    <start to="test_shell"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="test_shell">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>shell.sh</exec>
            <argument>${table}</argument>
            <env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
            <file>/user/oozie/lib/test_shell.sh#shell.sh</file>
            <file>/user/oozie/input/tables.txt#tables.txt</file>
        </shell>
        <ok to="End"/>
        <error to="email-error"/>
    </action>
    <action name="email-error">
        <email xmlns="uri:oozie:email-action:0.2">
            <to>xxxxxxxxxx.com</to>
            <subject>Status of workflow ${table}</subject>
            <body>The workflow ${table} ${wf:id()} had issues and was killed. The error message is: ${wf:errorMessage(wf:lastErrorNode())}</body>
            <content_type>text/plain</content_type>
        </email>
        <ok to="End"/>
        <error to="End"/>
    </action>
    <end name="End"/>
</workflow-app>

I was able to pass the file to the script by using the following in the workflow:

<argument>${input_file}</argument>
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
<file>/user/oozie/lib/test_shell.sh#shell.sh</file>
<file>/user/oozie/input/${input_file}#${input_file}</file>
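
Inside the script I loop over the table names from the file; a simplified sketch of the idea is below (the hive call here is only a placeholder for the actual per-table processing):

#!/bin/bash
# Simplified sketch of shell.sh: read the table names from the file
# passed as the first argument and process them one by one.
# The hive call is a placeholder for the real per-table work.
input_file="$1"

while read -r table; do
    [ -z "$table" ] && continue        # skip blank lines
    hive -e "SELECT COUNT(*) FROM ${table};"
done < "$input_file"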

Now I have a problem.

If the workflow fails for one of the tables in the input_file, I do not get any email. I get an email only if the workflow fails for the last table in the input_file.

Why is this happening, and how can I get an email every time the workflow fails? Or am I doing the whole process wrong?

Could anyone please explain and correct me where I am going wrong?

1 REPLY


Inside your shell script, you may want to check whether each Hive table is processed successfully; if one of the tables fails, exit the script with a non-zero code. The shell action will then fail and transition to email-error.
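
For example, a rough sketch of that check (the hive command here is only a placeholder for whatever your script actually runs per table):

#!/bin/bash
# Rough sketch: stop at the first table that fails and exit with a
# non-zero code, so the Oozie shell action fails and the workflow
# transitions to the email-error action.
# The hive command is a placeholder for the real per-table work.
input_file="$1"

while read -r table; do
    [ -z "$table" ] && continue        # skip blank lines
    if ! hive -e "SELECT COUNT(*) FROM ${table};"; then
        echo "Processing failed for table: ${table}" >&2
        exit 1                         # non-zero exit fails the action
    fi
done < "$input_file"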
