Reply
Highlighted
rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Loading data from HIve to MySQL using Sqoop Oozie

 

Hello,


I am facing some grave issue (tried unsuccessful 196 times) while loading data using Oozie/Sqoop (sqoop command inside Oozie) to MySQL table. If there is only one column of data in HDFS then there is no issue but when there are more than 1 column e.g. 2 columns, data does not get loaded to MySQL.

If I run the Sqoop by itself, then the data gets loaded to MySQL but when I put inside Oozie, data does not get laoded.
The workflow.xml has 2 parts, 1st it loads data from Hive table to HDFS and 2nd loads data from HDFS to MySQL.
I am using ClouderaVM.

-

hive> CREATE EXTERNAL TABLE IF NOT EXISTS foo (
name string,
city string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/cloudera/foo';

-

$ vi foo
1 a
4 b
hive> load data local inpath '/home/cloudera/foo' into table foo;

-

mysql> CREATE TABLE `foo` (`id` int(11) DEFAULT NULL, `name` varchar(22) DEFAULT NULL );

-

workflow.xml:
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.2" name="etl-wf">
<start to="hive-node"/>
<action name="hive-node">
<hive xmlns="uri:oozie:hive-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<job-xml>hive-site.xml</job-xml>
<script>script.q</script>
</hive>
<ok to="sqoop-node"/>
<error to="fail"/>
</action>
<action name="sqoop-node">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<command>export --connect jdbc:mysql://localhost/test --username rio --password r005 --table foo --export-dir /user/cloudera/test --input-fields-terminated-by '\t' --input-lines-terminated-by '\n'</command>
</sqoop>
<ok to="end"/>
<error to="fail"/>
</action>

<kill name="fail">
<message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>

Note: here 1st part works i.e. data gets laoded to test table (hdfs: /user/cloudera/test) in hive but does not get loaded from hdfs: /user/cloudera/test to MySQL foo table.

-

vi script.q:

CREATE EXTERNAL TABLE IF NOT EXISTS test (
id int,
city string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION
'/user/cloudera/test';

INSERT OVERWRITE table test SELECT * FROM foo;

-

stderr logs

Note: /tmp/sqoop-mapred/compile/d4f769ef09667984820f21a38ae27bb4/foo.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
java.io.IOException: Can't export data, please check task tracker logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.NumberFormatException: For input string: "1 a"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Integer.parseInt(Integer.java:458)
at java
Intercepting System.exit(1)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

-

New Contributor
Posts: 2
Registered: ‎05-21-2014

Re: Loading data from HIve to MySQL using Sqoop Oozie

Your workflow.xml file has some issues in the sqoop action.

 

Basically, you need to pass in the command and then each argument should get its own line.

 

<command>export</command>

<arg>--connect</arg>

<arg>jdbc:mysql://localhost/test</arg>

...

etc.

 

Editing oozie workflows is always painful.  Good luck.

rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Re: Loading data from HIve to MySQL using Sqoop Oozie

I did put on tags still not working.

rio
Explorer
Posts: 48
Registered: ‎04-18-2014

Re: Loading data from HIve to MySQL using Sqoop Oozie

Can Oozie parse data if it is tab delimited? Does columns need to enclosed by quotation mark?

 

Thanks!

 

Rio