Created 05-26-2016 08:41 AM
Hi All,
I am trying to do sqoop import with where condition/free form query with oozie and it is failing.
my oozie action looks like :
<action name="sqoop-action"> <sqoop xmlns="uri:oozie:sqoop-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <prepare> <delete path="${deleteHdfsPath}"/> </prepare> <configuration> <property> <name>oozie.hive.defaults</name> <value>/usr/hdp/current/hive-client/conf/hive-site.xml</value> </property> </configuration> <command>${command}</command> </sqoop> <ok to="ok" /> <error to="kill" /> </action>
In properties file for command argument looks like this :
import --connect jdbc:mysql://<>:3306/test --username hive --password hive --query "SELECT * FROM _table WHERE \$CONDITIONS AND _id > 0 AND _id <= 1000000" --split-by _id --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table
and I have tried with below also.
import --connect jdbc:mysql://ip-172-31-5-150.ec2.internal:3306/test --username hive --password hive --table _table --where "_id > 0 AND _id <= 1000000" --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table
I am getting below error with both the way, it saying Unrecognized argument.
Sqoop command arguments : import --connect jdbc:mysql://<>:3306/test --username hive --password hive --query SELECT * FROM _table WHERE $CONDITIONS AND _id > 0 AND _id <= 1000000 --split-by _id --fields-terminated-by | --target-dir /apps/hive/warehouse/hive_table Fetching child yarn jobs tag id : oozie-fcc0762084b58ea3c408ef7887cc26a7 2016-05-25 23:58:33,814 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /172.31.5.150:8050 Child yarn jobs are found - ================================================================= >>> Invoking Sqoop command line now >>> 3601 [main] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. 2016-05-25 23:58:34,126 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. 3642 [main] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6.2.4.0.0-169 2016-05-25 23:58:34,167 INFO [main] sqoop.Sqoop (Sqoop.java:<init>(97)) - Running Sqoop version: 1.4.6.2.4.0.0-169 3667 [main] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead. 2016-05-25 23:58:34,192 WARN [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1026)) - Setting your password on the command-line is insecure. Consider using -P instead. 3668 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Error parsing arguments for import: 2016-05-25 23:58:34,193 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(304)) - Error parsing arguments for import: 3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: * 2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: * 3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: FROM 2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: FROM 3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _product 2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _product 3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: WHERE 2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: WHERE 3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: $CONDITIONS 2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: $CONDITIONS 3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: AND 2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND 3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id 2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id 3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: > 2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: > 3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 0 2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 1000000 3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: AND 2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND 3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id 2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id 3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: <= 2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: <= 3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: 1000000 2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 2000000 3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --split-by 2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --split-by 3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: _id 2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id 3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --fields-terminated-by 2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --fields-terminated-by 3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: | 2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: | 3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: --target-dir 2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --target-dir 3679 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool - Unrecognized argument: /apps/hive/warehouse/n_product_2 2016-05-25 23:58:34,204 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: /apps/hive/warehouse/n_product_2 Intercepting System.exit(1) <<< Invocation of Main class completed <<<
Sqoop command working fine in both the case, But running with oozie not working. All through simple sqoop action is working(without query - whole table fetch) fine with oozie.. but some how free form query is not working with oozie.
Let me know, if I am missing something.
Thanks in advance,
Ankit
Created 05-26-2016 10:11 PM
Hi @Ankit A
In this case you need to set your "import *" statement into set of arguments as mentioned by @Christine so that oozie can pass it to sqoop.
Here is an example workflow. Link
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf"> <start to="sqoop-freeform-node"/> <action name="sqoop-freeform-node"> <sqoop xmlns="uri:oozie:sqoop-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <prepare> <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform"/> <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/> </prepare> <configuration> <property> <name>mapred.job.queue.name</name> <value>${queueName}</value> </property> </configuration> <arg>import</arg> <arg>--connect</arg> <arg>jdbc:hsqldb:file:db.hsqldb</arg> <arg>--username</arg> <arg>sa</arg> <arg>--password</arg> <arg></arg> <arg>--verbose</arg> <arg>--query</arg> <arg>select TT.I, TT.S from TT where $CONDITIONS</arg> <arg>--target-dir</arg> <arg>/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform</arg> <arg>-m</arg> <arg>1</arg> <file>db.hsqldb.properties#db.hsqldb.properties</file> <file>db.hsqldb.script#db.hsqldb.script</file> </sqoop> <ok to="end"/> <error to="fail"/> </action> <kill name="fail"> <message>Sqoop free form failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> </kill> <end name="end"/> </workflow-app>
Created 05-26-2016 08:04 PM
Try setting the sqoop command up with arguments (nothing should be in command). Properties and values for each element should be entered as separate arguments.
arg: import arg: --connect arg: jdbc:mysql.... arg: --username arg: [username] arg: --password-file arg: [password file] arg: --query arg: select ..... arg: --target-dir arg: [target]
Created 05-29-2016 01:02 PM
Thanks that worked
Created 05-26-2016 10:11 PM
Hi @Ankit A
In this case you need to set your "import *" statement into set of arguments as mentioned by @Christine so that oozie can pass it to sqoop.
Here is an example workflow. Link
<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf"> <start to="sqoop-freeform-node"/> <action name="sqoop-freeform-node"> <sqoop xmlns="uri:oozie:sqoop-action:0.2"> <job-tracker>${jobTracker}</job-tracker> <name-node>${nameNode}</name-node> <prepare> <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform"/> <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/> </prepare> <configuration> <property> <name>mapred.job.queue.name</name> <value>${queueName}</value> </property> </configuration> <arg>import</arg> <arg>--connect</arg> <arg>jdbc:hsqldb:file:db.hsqldb</arg> <arg>--username</arg> <arg>sa</arg> <arg>--password</arg> <arg></arg> <arg>--verbose</arg> <arg>--query</arg> <arg>select TT.I, TT.S from TT where $CONDITIONS</arg> <arg>--target-dir</arg> <arg>/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform</arg> <arg>-m</arg> <arg>1</arg> <file>db.hsqldb.properties#db.hsqldb.properties</file> <file>db.hsqldb.script#db.hsqldb.script</file> </sqoop> <ok to="end"/> <error to="fail"/> </action> <kill name="fail"> <message>Sqoop free form failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message> </kill> <end name="end"/> </workflow-app>
Created 05-29-2016 01:02 PM
Thanks, That worked