Master Collaborator

Summary

 

Oozie does not currently provide a native Impala action. This guide shows how to run Impala statements from an Oozie workflow by using a shell action that calls impala-shell.

 

 

Applies To

 

Impala, Oozie

 

Instructions

Because there is currently no Impala action, you must use a shell action that calls impala-shell. The shell script that calls impala-shell must also set the PYTHON_EGG_CACHE environment variable. Here is an example shell script:

 

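#!/bin/bash
# impala-shell needs a writable directory for its Python eggs
export PYTHON_EGG_CACHE=./myeggs
# Kerberized clusters only: get a ticket from the keytab shipped with the action
/usr/bin/kinit -kt YourKeytabFile.keytab -V <your username>
impala-shell -q "invalidate metadata"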
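
The egg cache step in the script above can be made a little more defensive. The following is only a sketch, assuming the same relative directory name (./myeggs) as the example; the function name prepare_egg_cache is hypothetical and not part of Impala or Oozie. It creates the directory, checks that the container user can write to it, and only then exports the variable.

```shell
#!/bin/bash
# Sketch: prepare a writable egg cache before calling impala-shell.
# prepare_egg_cache is a hypothetical helper, not an Impala or Oozie command.
prepare_egg_cache() {
    local cache_dir="${1:-./myeggs}"
    # Create the directory inside the action's working directory
    mkdir -p "$cache_dir" || return 1
    # Fail early if the current user cannot write to it
    [ -w "$cache_dir" ] || return 1
    export PYTHON_EGG_CACHE="$cache_dir"
}

if ! prepare_egg_cache ./myeggs; then
    echo "cannot create a writable egg cache" >&2
    exit 1
fi
echo "PYTHON_EGG_CACHE=$PYTHON_EGG_CACHE"
# The kinit and impala-shell calls from the example script would follow here.
```

Failing before impala-shell starts makes the cause visible in the action's stderr log rather than leaving only a non-zero exit code.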

 

Note the PYTHON_EGG_CACHE setting: you must set this location or the job will fail. The script also runs kinit, which is needed only on a Kerberized cluster. Here is the workflow that goes with the script:

 

 

<workflow-app name="shell-impala-invalidate-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="shell-impala-invalidate"/>
    <action name="shell-impala-invalidate">
      <shell xmlns="uri:oozie:shell-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
          <property>
            <name>mapred.job.queue.name</name>
            <value>${queueName}</value>
          </property>
        </configuration>
        <exec>shell-impala-invalidate.sh</exec>
        <file>shell-impala-invalidate.sh#shell-impala-invalidate.sh</file>
        <file>YourKeytabFile.keytab#YourKeytabFile.keytab</file>
      </shell>
      <ok to="end"/>
      <error to="kill"/>
    </action>
    <kill name="kill">
      <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

 

You must include the <file> tag for the shell script. The <file> tag for the keytab is needed only if you are using Kerberos.
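
The workflow refers to ${jobTracker}, ${nameNode} and ${queueName}, which are normally supplied in a job.properties file at submission time. The sketch below writes such a file; every host name, port and HDFS path in it is a placeholder assumption and must be replaced with values from your own cluster.

```shell
#!/bin/bash
# Sketch: generate a job.properties for the workflow above.
# All host names, ports and paths are placeholders (assumptions).
# The quoted 'EOF' keeps the shell from expanding ${...}; Oozie resolves those.
cat > job.properties <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
queueName=default
oozie.wf.application.path=${nameNode}/user/${user.name}/shell-impala-invalidate
EOF

# Submit with the Oozie CLI (not run here; the server URL is a placeholder):
# oozie job -oozie http://oozie.example.com:11000/oozie -config job.properties -run
```

The directory named in oozie.wf.application.path is expected to hold workflow.xml, the shell script and, on a Kerberized cluster, the keytab, so that the relative <file> entries resolve.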

 

 

Comments
New Contributor

Hi there, and first of all thank you for being willing to enlighten me. I was wondering whether I could do this by submitting a sample shell job XML to the Oozie REST web API. I tried a Hive job through the Oozie REST API and it was accepted, but a shell job is not, or maybe I'm doing it wrong.

 

I tried a simple job, with the workflow.xml and the script it uses stored in HDFS.

 

The script contains only:

 

#!/bin/bash

 

impala-shell -f <local-file-directory>/file.sql

 

and I'm still getting exit code -1 or some other failure.

Master Collaborator

@xylenet, from what I'm told this may be a pretty involved process and there's no easy or quick answer to that question. You might get better traction by starting a "custom REST API call in Oozie" topic in the Oozie board area. More Oozie experts will be listening over there.

 

HTH,

 

Clint

New Contributor

Hi Clint, thanks for the additional link. I will post my question there and hope for the best.

 

And by the way, I may have missed it, but which Python eggs are we talking about here? I'm just using the Cloudera quickstart.

Master Guru

Impala's 'impala-shell' is a Python-based program. It uses some eggs to run, so it needs a cache directory to work with. By setting the PYTHON_EGG_CACHE environment variable to a relative path, you let it use a writable location within the container's environment and user restrictions.

Explorer

How does setting PYTHON_EGG_CACHE help if the impala-shell script overwrites its value? See lines 36-39:

https://github.com/cloudera/Impala/blob/cdh5-trunk/shell/impala-shell#L36-L39

 

My workaround is to add the following to the script before running impala-shell. This works with CDH 5.5.0 and Impala 2.3.0-cdh5.5.0.

# Workaround for the 'Error In Running Impala From Oozie' issue:
# impala-shell uses $USER and not $(whoami)
if [ "$(whoami)" = "yarn" ]; then
    export USER=yarn
    export PYTHON_EGG_CACHE=/tmp/impala-shell-python-egg-cache-${USER}
fi

echo "$QUERY" | impala-shell -i "localhost" -f -

 

Explorer
Thanks for this post, saved me some pain.
New Contributor

Hi ,

 

I am trying to use an Oozie shell action to call a shell script that runs a few Hive commands.

 

When I run the Oozie job, it always fails.

 

We are using CDH 5.4.8. Can anyone tell me whether it is possible to call Hive commands from a shell action?

 

The script works standalone, but it does not work when run through Oozie.

Version history: revision 6 of 6, last updated 10-01-2015 09:39 AM