When using Sqoop within Oozie, Sqoop fails during Hive import because the Hive shell is disabled, Sqoop job fails when run from Oozie

Explorer

Hello,

When I run my sqoop command from the command line, it succeeds and imports results into hive. But when I run the same sqoop command from oozie, it fails with this message:

org.apache.sqoop.hive.HiveImport - Hive shell has been disabled. Please use beeline instead.

I'm aware that we've disabled the Hive shell, however sqoop seems to function correctly -- including doing the hive import -- when launched directly from command line.

It looks like when I run sqoop from oozie, it attempts to use the old hive shell under the covers, but it does not do this when I run sqoop directly from the command line. Any idea why that is the case? Is there some configuration parameter in oozie that is needed to make it use beeline? My Google searches haven't turned anything up so far.

Thanks, Jake

Hello,

I have a sqoop command that imports to hive that is successful when I run the sqoop command directly from the command line. However when I try to run this same command from an Oozie workflow, sqoop fails with the message:

org.apache.sqoop.hive.HiveImport - Hive shell has been disabled. Please use beeline instead.

I am aware that we have disabled our hive shell, however I don't get this error when I run sqoop directly from the command line. I have tried the sqoop command from the edge nodes and master nodes, and it succeeds in both places. Is there a configuration parameter I would need to use when I run sqoop from oozie to avoid this issue?

Thanks,

Jake

1 ACCEPTED SOLUTION

Master Guru

@Jake Kugel

I was able to reproduce this on my local cluster and resolve it.

You need to add the property below to your job.properties file.

oozie.action.sharelib.for.sqoop=sqoop,hive

Note: Sqoop uses the Hive CliDriver class rather than the hive script. When launched from Oozie, that class was not found on the classpath, so Sqoop fell back to trying the Hive CLI.

Also, to avoid further issues, please add hive-site.xml to the Sqoop action in your workflow.xml:

<file>$some_location_on_hdfs/hive-site.xml#hive-site.xml</file>
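
For reference, here is a minimal sketch of where the `<file>` element could sit, assuming a Sqoop action that uses the `uri:oozie:sqoop-action:0.2` schema. The `hiveSiteDir` parameter is hypothetical (it would need to be defined in job.properties and point at the HDFS directory holding your copy of hive-site.xml), and the argument list is shortened for illustration:

```xml
<!-- Sketch only. job.properties is assumed to contain:
     oozie.action.sharelib.for.sqoop=sqoop,hive
     hiveSiteDir=(hypothetical: HDFS directory holding hive-site.xml) -->
<action name="sqoop-from-db2">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <arg>import</arg>
        <!-- ... remaining Sqoop arguments go here, one per arg element ... -->
        <arg>--hive-import</arg>
        <!-- file elements follow the arg elements; the part after '#' is the
             link name the file gets in the action's working directory -->
        <file>${hiveSiteDir}/hive-site.xml#hive-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
</action>
```

In that schema the `<file>` elements are declared after the `<command>` or `<arg>` elements, so placing `<file>` above the `<arg>` lines would likely fail workflow validation.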

Credit goes to @pjoseph for finding the reason why this was happening! 🙂

7 REPLIES

Explorer

Sorry for the repeated text in title and detail.


There may be a compatibility issue with your Sqoop workflow definition XML. Could you post it?

What HDP version are you using?

Here is the Sqoop action definition for Oozie 4.2.0.

http://oozie.apache.org/docs/4.2.0/DG_SqoopActionExtension.html

avatar
Explorer

Thanks for your reply, and thanks for the link, I will take a look. Here are the versions of the components I'm using:

Oozie version: 4.2.0.2.3.4.0-3485
Sqoop version: 1.4.6.2.3.4.0-3485
Hadoop version: 2.7.1.2.3.4.0-3485

And here is the workflow.xml. Items in square brackets are placeholders for values I removed because they contained sensitive information of some sort.

<workflow-app name="sqoop-report" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop-from-db2"/>
    <action name="sqoop-from-db2">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.use.system.libpath</name>
                    <value>true</value>
                </property>
            </configuration>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>[JDBC URL]</arg>
            <arg>-m</arg>
            <arg>1</arg>
            <arg>--username</arg>
            <arg>[DB2 USERNAME]</arg>
            <arg>--password-file</arg>
            <arg>[PASSWORD FILE]</arg>
            <arg>--target-dir</arg>
            <arg>[TARGET DIR]</arg>
            <arg>--hive-table</arg>
            <arg>[TARGET TABLE]</arg>
            <arg>--hive-import</arg>
            <arg>--query</arg>
            <arg>[FREE FORM SQL]</arg>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>

Expert Contributor

Adding to @Kuldeep Kulkarni's comments: also make sure that the appropriate driver jar is present in the Oozie shared lib location on HDFS.

Explorer

@Kuldeep Kulkarni thank you for the reply! I no longer have the original environment, so I wasn't able to try it myself, but I appreciate the response. I have heard that, had I resolved the 'Hive shell has been disabled' error, I would have hit a further issue: our Hive is configured to use the Tez execution engine, and as far as I know, Sqoop only supports MapReduce. But I think that is a separate issue from the original post here.

Master Guru

@Jake Kugel - Please have a look at my answer and accept it if your job runs successfully!