Reply
Expert Contributor
Posts: 64
Registered: ‎11-24-2017
Accepted Solution

Oozie Sqoop action fails on --hive-import

[ Edited ]

Hi!

I have the following workflow on Oozie with a Sqoop action that should import a mysql table into Hive:

 

workflow.xml (fails)

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by '\t' --hive-import --hive-table categories</command>
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

 

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/spark-app
oozie.use.system.libpath=true

 

 

This workflow fails always, but in the logs there is nothing about any exception or error. Also the same Sqoop command run from the shell works fine.

The following workflow instead completes with success (it does not import the table in Hive but only on HDFS):

 

workflow.xml 

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/cloudera/categories"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by '\t' --warehouse-dir /user/cloudera</command>
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

Does anyone know how to get the first workflow with Hive import to complete?

 

 PS: I am using CDH quickstart 5.12.

 

Expert Contributor
Posts: 64
Registered: ‎11-24-2017

Re: Oozie Sqoop action fails on --hive-import

[ Edited ]

 

Solved with the following:

 

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/sqoop-app
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

Copied hive-site.xml in the root folder of the application and added it in the workflow under <job-xml> tag:

 

workflow.xml

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/cloudera/categories"/>
            </prepare>
	<job-xml>hive-site.xml</job-xml>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by ',' --hive-import --hive-table cloudera.categories</command>	
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

Announcements