Reply
Highlighted
Expert Contributor
Posts: 69
Registered: ‎11-24-2017

Can't get Oozie to work with Sqoop actions and MySQL (CDH 5.13.1)

Hi everyone!

I have cluster with CDH 5.13.1 installed thorugh Cloudera Manager with Parcel option.

 

I am having some troubles to get Oozie working properly with Sqoop actions and MySQL database.

 

Just after CDH installation I searched if a mysql jar connector was already present in the system, I found it in 

 

/usr/share/cmf/cloudera-scm-telepub/jars/mysql-connector-java-5.1.15.jar

so I just copied that file and place it in /var/lib/sqoop/ to enable Sqoop import from MySQL. After that Sqoop can sucessfully connect to MySQL db, for example the following sqoop command in the shell works fine:

 

 

 

sqoop list-tables --driver com.mysql.jdbc.Driver --connect jdbc:mysql://10.0.0.4/employees --username cloudera --password-file /user/cloudera/pwd

 

where /user/cloudera/pwd is the hdfs path to the file where I've placed the MySQL password.

 

Then I've tried to configure a workflow with the same Sqoop command.

I've added the  mysql-connector-java-5.1.15.jar file to the oozie share lib in hdfs, and set owner and permissions:

 

 

-rwxrwxr-x   3 oozie oozie      15618 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/logredactor-1.0.3.jar
-rwxrwxr-x   3 oozie oozie     388864 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/mail-1.4.jar
-rwxrwxr-x   3 oozie oozie      85448 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/metrics-core-3.0.2.jar
-rwxrwxr-x   3 oozie oozie      15800 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/metrics-json-3.0.2.jar
-rwxrwxr-x   3 oozie oozie      30865 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/metrics-jvm-3.0.2.jar
-rwxrwxrwx   3 oozie oozie     785998 2018-01-21 00:24 /user/oozie/share/lib/lib_20180118201009/sqoop/mysql-connector-java-5.1.15.jar
-rwxrwxr-x   3 oozie oozie    1292696 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/netty-3.10.6.Final.jar
-rwxrwxr-x   3 oozie oozie      18246 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/oozie-sharelib-sqoop-4.1.0-cdh5.13.1.jar
-rwxrwxr-x   3 oozie oozie      18246 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/oozie-sharelib-sqoop.jar
-rwxrwxr-x   3 oozie oozie      19827 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/opencsv-2.3.jar
-rwxrwxr-x   3 oozie oozie      29555 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/paranamer-2.3.jar
-rwxrwxr-x   3 oozie oozie     106445 2018-01-18 20:11 /user/oozie/share/lib/lib_20180118201009/sqoop/parquet-avro.jar

 

I've also checked in the Cloudera Manager console if the Oozie share lib was properly configured: 

 

 

Schermata_2018_01_21_alle_12_15_32

 

 

 

This is the workflow file

workflow.xml

 

<workflow-app name="OOZIE_SQOOP_WF" xmlns="uri:oozie:workflow:0.4">

        <start to="sqoop_action" />

        <action name="sqoop_action">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>

            <prepare>
                <delete path="${nameNode}/user/cloudera/sqoop/titles"/>
            </prepare>

            <command>list-tables --driver com.mysql.jdbc.Driver --connect jdbc:mysql://10.0.0.4/employees --username cloudera --password-file /user/cloudera/pwd</command>

        </sqoop>
        <ok to="success"/>
        <error to="fail"/>
    </action>

	<kill name="fail">
                <message>JOB FAILED!</message>
        </kill>

        <end name="success"/>

</workflow-app>

 

 

This is the job.properties file:

job.properties

 

nameNode=hdfs://master:8020
jobTracker=master:8032
oozie.wf.application.path=${nameNode}/user/cloudera/oozie/list-tables-app
oozie.use.system.libpath=true

 

 

But I got an error, it seems Oozie can't find the driver for mysql. From the yarn logs I can read:

 

Fetching child yarn jobs
tag id : oozie-126c51301f495d4c66757805b23eb8be
2018-01-21 10:49:25,093 [main] INFO  org.apache.hadoop.yarn.client.RMProxy  - Connecting to ResourceManager at master/10.0.0.4:8032
Child yarn jobs are found -
=================================================================

>>> Invoking Sqoop command line now >>>

2018-01-21 10:49:25,624 [main] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2018-01-21 10:49:25,762 [main] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6-cdh5.13.1
2018-01-21 10:49:25,962 [main] WARN  org.apache.sqoop.ConnFactory  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2018-01-21 10:49:26,001 [main] WARN  org.apache.sqoop.ConnFactory  - Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manage$
2018-01-21 10:49:26,029 [main] INFO  org.apache.sqoop.manager.SqlManager  - Using default fetchSize of 1000
2018-01-21 10:49:26,030 [main] ERROR org.apache.sqoop.Sqoop  - Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver

<<< Invocation of Sqoop command completed <<<

No child hadoop job is executed.
Intercepting System.exit(1)

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

Oozie Launcher failed, finishing Hadoop job gracefully

Oozie Launcher, uploading action data to HDFS sequence file: hdfs://master:8020/user/cloudera/oozie-oozi/0000010-180120225408257-oozie-oozi-W/sqoop_action--sqoop/action-data.seq
2018-01-21 10:49:26,418 [main] INFO  org.apache.hadoop.io.compress.zlib.ZlibFactory  - Successfully loaded & initialized native-zlib library
2018-01-21 10:49:26,419 [main] INFO  org.apache.hadoop.io.compress.CodecPool  - Got brand-new compressor [.deflate]
Successfully reset security manager from org.apache.oozie.action.hadoop.LauncherSecurityManager@3c5e76f7 to null

Oozie Launcher ends

 

Any idea why it can't find the driver, although I put the connector in the share lib with proper ownership and permissions and in the job.properties the following property is set to true?

oozie.use.system.libpath=true

 

What am I doing wrong?

 

Thanks for any help!

 

 

Announcements