Support Questions

Find answers, ask questions, and share your expertise

Oozie Sqoop action fails on --hive-import

avatar
Expert Contributor

Hi!

I have the following workflow on Oozie with a Sqoop action that should import a mysql table into Hive:

 

workflow.xml (fails)

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by '\t' --hive-import --hive-table categories</command>
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

 

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/spark-app
oozie.use.system.libpath=true

 

 

This workflow fails always, but in the logs there is nothing about any exception or error. Also the same Sqoop command run from the shell works fine.

The following workflow instead completes with success (it does not import the table in Hive but only on HDFS):

 

workflow.xml 

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/cloudera/categories"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.compress.map.output</name>
                    <value>true</value>
                </property>
            </configuration>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by '\t' --warehouse-dir /user/cloudera</command>
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

Does anyone know how to get the first workflow with Hive import to complete?

 

 PS: I am using CDH quickstart 5.12.

 

1 ACCEPTED SOLUTION

avatar
Expert Contributor

 

Solved with the following:

 

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/sqoop-app
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

Copied hive-site.xml in the root folder of the application and added it in the workflow under <job-xml> tag:

 

workflow.xml

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/cloudera/categories"/>
            </prepare>
	<job-xml>hive-site.xml</job-xml>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by ',' --hive-import --hive-table cloudera.categories</command>	
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>

 

View solution in original post

1 REPLY 1

avatar
Expert Contributor

 

Solved with the following:

 

job.properties

nameNode=hdfs://quickstart.cloudera:8020
jobTracker=localhost:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/sqoop-app
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.action.sharelib.for.hive=hive,hcatalog,sqoop

Copied hive-site.xml in the root folder of the application and added it in the workflow under <job-xml> tag:

 

workflow.xml

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">
    
	<start to="sqoop_action" />		

	<action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/cloudera/categories"/>
            </prepare>
	<job-xml>hive-site.xml</job-xml>
            <command>import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories --fields-terminated-by ',' --hive-import --hive-table cloudera.categories</command>	
        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
		<message>JOB FAILED!</message>
	</kill>

	<end name="success"/>	
    
</workflow-app>