Community Articles
Find and share helpful community-sourced technical articles.
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.
Labels (1)
Super Guru

Below are the steps to run Hive(TEZ) query in a shell script using Oozie shell action

.

1. Configure job.properties

Example:

#*************************************************
#  job.properties
#*************************************************
nameNode=hdfs://<namenode-fqdn>:8020
jobTracker=<resourcemanager-host-fqdn>:8050
queueName=default
examplesRoot=examples
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/

.

2. Configure Workflow.xml

Example:

<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.4"
name="test-shell-with-kerberos-wf">
    <global>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>${queueName}</value>
            </property>
            <property>
                <name>tez.queue.name</name>
                <value>${queueName}</value>
            </property>
        </configuration>
    </global>
      <credentials>
      <credential name="hive_credentials" type="hcat">
        <property>
          <name>hcat.metastore.uri</name>
          <value>thrift://<metastore-server>:9083</value>
        </property>
        <property>
          <name>hcat.metastore.principal</name>
          <value>hive/_HOST@REALM</value>
        </property>
      </credential>
    </credentials>
    <start to="run-shell-script"/>
       <action name="run-shell-script" cred="hive_credentials">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>tez.lib.uris</name>
                    <value>/hdp/apps/<VERSION>/tez/tez.tar.gz</value>
                </property>
            </configuration>
            <exec>/user/<username>/hive.sh</exec>
            <file>/user/<username>/hive.sh#hive.sh</file>
         </shell>
        <ok to="end"/>
        <error to="killnode"/>
    </action>
    <kill name="killnode">
        <message>Job failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

.

3. Write sample shell script

Example:

#!/bin/bash
hive --hiveconf mapreduce.job.credentials.binary=$HADOOP_TOKEN_FILE_LOCATION --hiveconf tez.credentials.path=$HADOOP_TOKEN_FILE_LOCATION -e 'select * from test_hive;'

4. Upload workflow.xml and shell script to "oozie.wf.application.path" defined in job.properties

.

5. Follow below command to run Oozie workflow

oozie job -oozie http://<oozie-server-hostname>:11000/oozie -config /$PATH/job.properties -run

.

Please comment if you have any question! Happy Hadooping!! 🙂

3,659 Views
Comments
Expert Contributor

I faced the following error where the shell script is running over 24 hours, and failing to launch hive scripts after 24 hours with following error.

Workaround: Increase the following Property Value in hive-site.xml and restart Hive Metastore to the desired need of oozie shell script to continue running.

  • hive.cluster.delegation.token.renew-interval (Default: 86400000 i.e. 24hrs)
  • hive.cluster.delegation.token.max-lifetime (Default: 604800000 i.e. 7days)

yarn application log

Stdoutput 16/10/16 17:12:00 [main]: WARN hive.metastore: Failed to connect to the MetaStore Server...
Stdoutput org.apache.thrift.transport.TTransportException: Peer indicated failure: DIGEST-MD5: IO error acquiring password

Hive MetaStore Error:

ERROR [pool-5-thread-198]: transport.TSaslTransport (TSaslTransport.java:open(315)) - SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: owner=user, renewer=oozie, realUser=oozie/oozie.host.name@EXAMPLE.COM, issueDate=1476560270232, maxDate=1477165070232, sequenceNumber=51, masterKeyId=714]
Don't have an account?
Version history
Last update:
‎08-22-2016 11:59 PM
Updated by:
Contributors
Top Kudoed Authors