How to store Pig output in a Hive table using Oozie?


I have created an HDInsight cluster on Azure. I want to store Pig output in a Hive table using an Oozie job. How can I do this? Can anyone give me an example of the workflow and job files for this?

Thank you.

Re: How to store Pig output in a Hive table using Oozie?

@heta desai

Your Oozie workflow needs a Pig action in order to execute a Pig script.

Here is a generic outline of such an action:

<workflow-app>
    ...
    <action name="[NODE-NAME]">
        <pig>
            ...
            <script>[PIG-SCRIPT]</script>
            <argument>[ARGUMENT-VALUE]</argument>
            ...
            <argument>[ARGUMENT-VALUE]</argument>
            ...
        </pig>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...
</workflow-app>

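You can refer to this wiki for various use cases of Pig via Oozie.

In the Pig script, use HCatStorer to write the Pig output to a Hive table. Note that HCatStorer writes to an existing table, so the table has to be created in Hive beforehand.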

STORE my_pig_output INTO 'dbname.tablename' USING org.apache.hive.hcatalog.pig.HCatStorer();
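Putting the two pieces together, a filled-in workflow could look roughly like the sketch below. This is only an illustration under some assumptions, not a tested HDInsight configuration: the node names, the script name `store_to_hive.pig`, and the `DB_TABLE` / `dbTable` parameter are hypothetical, and it assumes a copy of the cluster's `hive-site.xml` sits in the workflow application directory so that HCatStorer can find the Hive metastore.

```xml
<workflow-app name="pig-to-hive-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="pig-store-node"/>

    <action name="pig-store-node">
        <pig>
            <!-- Both values are expected to come from the job file -->
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Assumption: hive-site.xml was copied next to workflow.xml -->
            <job-xml>hive-site.xml</job-xml>
            <!-- Hypothetical script; its last statement would be
                 STORE my_pig_output INTO '$DB_TABLE'
                 USING org.apache.hive.hcatalog.pig.HCatStorer(); -->
            <script>store_to_hive.pig</script>
            <!-- Passed to Pig as a parameter, e.g. dbname.tablename -->
            <param>DB_TABLE=${dbTable}</param>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The Pig action also needs the HCatalog jars on its classpath. With the Oozie share lib this is usually arranged in the job file through the `oozie.action.sharelib.for.pig` property rather than in the workflow itself.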

Here is an example of an Oozie workflow with a Pig action. Although it does not cover the exact scenario you are looking for, it should give you a clear idea of what you need to do to create the required workflow.
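Since the question also asks for a job file, here is a minimal sketch of one. Oozie accepts the job parameters either as a Java properties file or as a Hadoop XML configuration file; the XML form is shown here. The bracketed values are placeholders you would fill in for your own cluster, `dbTable` is a hypothetical parameter name that a workflow could pass on to the Pig script, and the share lib names `pig,hive,hcatalog` are an assumption about what is installed on the cluster.

```xml
<configuration>
    <!-- Cluster endpoints referenced by the workflow as ${nameNode} and ${jobTracker} -->
    <property>
        <name>nameNode</name>
        <value>[NAMENODE-URI]</value>
    </property>
    <property>
        <name>jobTracker</name>
        <value>[JOBTRACKER-HOST:PORT]</value>
    </property>

    <!-- Location of the directory that holds workflow.xml -->
    <property>
        <name>oozie.wf.application.path</name>
        <value>[WORKFLOW-APP-PATH]</value>
    </property>

    <!-- Use the Oozie share lib and add the HCatalog jars for Pig actions -->
    <property>
        <name>oozie.use.system.libpath</name>
        <value>true</value>
    </property>
    <property>
        <name>oozie.action.sharelib.for.pig</name>
        <value>pig,hive,hcatalog</value>
    </property>

    <!-- Hypothetical parameter: target Hive table for HCatStorer -->
    <property>
        <name>dbTable</name>
        <value>dbname.tablename</value>
    </property>
</configuration>
```

If this is saved as `job.xml`, the job would typically be submitted with `oozie job -oozie [OOZIE-URL] -config job.xml -run`.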

As always, if this answer helps you, please consider accepting it.