Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Sqoop free from query with oozie not working

avatar
Explorer

Hi All,

I am trying to do sqoop import with where condition/free form query with oozie and it is failing.

my oozie action looks like :

<action name="sqoop-action">
   <sqoop xmlns="uri:oozie:sqoop-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
         <delete path="${deleteHdfsPath}"/>
      </prepare>
      <configuration>
         <property>
            <name>oozie.hive.defaults</name>
            <value>/usr/hdp/current/hive-client/conf/hive-site.xml</value>
         </property>
      </configuration>
      <command>${command}</command>
   </sqoop>
   <ok to="ok" />
   <error to="kill" />
</action>

In properties file for command argument looks like this :

import --connect jdbc:mysql://<>:3306/test --username hive --password hive --query "SELECT * FROM _table WHERE \$CONDITIONS AND _id > 0 AND _id <= 1000000" --split-by _id --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table

and I have tried with below also.

import --connect jdbc:mysql://ip-172-31-5-150.ec2.internal:3306/test --username hive --password hive --table _table --where "_id > 0 AND _id <= 1000000" --fields-terminated-by \| --target-dir /apps/hive/warehouse/hive_table

I am getting below error with both the way, it saying Unrecognized argument.

Sqoop command arguments :
             import
             --connect
             jdbc:mysql://<>:3306/test
             --username
             hive
             --password
             hive
             --query
             SELECT
             *
             FROM
             _table
             WHERE
             $CONDITIONS
             AND
             _id
             >
             0
             AND
             _id
             <=
             1000000
             --split-by
             _id
             --fields-terminated-by
             |
             --target-dir
             /apps/hive/warehouse/hive_table
Fetching child yarn jobs
tag id : oozie-fcc0762084b58ea3c408ef7887cc26a7
2016-05-25 23:58:33,814 INFO  [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /172.31.5.150:8050
Child yarn jobs are found - 
=================================================================
>>> Invoking Sqoop command line now >>>
3601 [main] WARN  org.apache.sqoop.tool.SqoopTool  - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
2016-05-25 23:58:34,126 WARN  [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
3642 [main] INFO  org.apache.sqoop.Sqoop  - Running Sqoop version: 1.4.6.2.4.0.0-169
2016-05-25 23:58:34,167 INFO  [main] sqoop.Sqoop (Sqoop.java:<init>(97)) - Running Sqoop version: 1.4.6.2.4.0.0-169
3667 [main] WARN  org.apache.sqoop.tool.BaseSqoopTool  - Setting your password on the command-line is insecure. Consider using -P instead.
2016-05-25 23:58:34,192 WARN  [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1026)) - Setting your password on the command-line is insecure. Consider using -P instead.
3668 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Error parsing arguments for import:
2016-05-25 23:58:34,193 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(304)) - Error parsing arguments for import:
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: *
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: *
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: FROM
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: FROM
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: _product
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _product
3669 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: WHERE
2016-05-25 23:58:34,194 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: WHERE
3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: $CONDITIONS
2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: $CONDITIONS
3670 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: AND
2016-05-25 23:58:34,195 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: _id
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: >
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: >
3671 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: 0
2016-05-25 23:58:34,196 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 1000000
3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: AND
2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: AND
3673 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: _id
2016-05-25 23:58:34,198 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: <=
2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: <=
3677 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: 1000000
2016-05-25 23:58:34,202 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: 2000000
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: --split-by
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --split-by
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: _id
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: _id
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: --fields-terminated-by
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --fields-terminated-by
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: |
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: |
3678 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: --target-dir
2016-05-25 23:58:34,203 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: --target-dir
3679 [main] ERROR org.apache.sqoop.tool.BaseSqoopTool  - Unrecognized argument: /apps/hive/warehouse/n_product_2
2016-05-25 23:58:34,204 ERROR [main] tool.BaseSqoopTool (BaseSqoopTool.java:hasUnrecognizedArgs(307)) - Unrecognized argument: /apps/hive/warehouse/n_product_2
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<

Sqoop command working fine in both the case, But running with oozie not working. All through simple sqoop action is working(without query - whole table fetch) fine with oozie.. but some how free form query is not working with oozie.

Let me know, if I am missing something.

Thanks in advance,

Ankit

1 ACCEPTED SOLUTION

avatar
Super Guru

Hi @Ankit A

In this case you need to set your "import *" statement into set of arguments as mentioned by @Christine so that oozie can pass it to sqoop.

Here is an example workflow. Link

<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf">
    <start to="sqoop-freeform-node"/>

    <action name="sqoop-freeform-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform"/>
                <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:hsqldb:file:db.hsqldb</arg>
            <arg>--username</arg>
            <arg>sa</arg>
            <arg>--password</arg>
            <arg></arg>
            <arg>--verbose</arg>
            <arg>--query</arg>
            <arg>select TT.I, TT.S from TT where $CONDITIONS</arg>
            <arg>--target-dir</arg>
            <arg>/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform</arg>
            <arg>-m</arg>
            <arg>1</arg>
            <file>db.hsqldb.properties#db.hsqldb.properties</file>
            <file>db.hsqldb.script#db.hsqldb.script</file>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>


    <kill name="fail">
        <message>Sqoop free form failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>



View solution in original post

4 REPLIES 4

avatar
Rising Star

Try setting the sqoop command up with arguments (nothing should be in command). Properties and values for each element should be entered as separate arguments.

arg:  import
arg:  --connect
arg:  jdbc:mysql....
arg:  --username
arg:  [username]
arg:  --password-file
arg:  [password file]
arg:  --query
arg:  select .....
arg:  --target-dir
arg:  [target]

avatar
Explorer

Thanks that worked

avatar
Super Guru

Hi @Ankit A

In this case you need to set your "import *" statement into set of arguments as mentioned by @Christine so that oozie can pass it to sqoop.

Here is an example workflow. Link

<workflow-app xmlns="uri:oozie:workflow:0.2" name="sqoop-freeform-wf">
    <start to="sqoop-freeform-node"/>

    <action name="sqoop-freeform-node">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform"/>
                <mkdir path="${nameNode}/user/${wf:user()}/${examplesRoot}/output-data"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <arg>import</arg>
            <arg>--connect</arg>
            <arg>jdbc:hsqldb:file:db.hsqldb</arg>
            <arg>--username</arg>
            <arg>sa</arg>
            <arg>--password</arg>
            <arg></arg>
            <arg>--verbose</arg>
            <arg>--query</arg>
            <arg>select TT.I, TT.S from TT where $CONDITIONS</arg>
            <arg>--target-dir</arg>
            <arg>/user/${wf:user()}/${examplesRoot}/output-data/sqoop-freeform</arg>
            <arg>-m</arg>
            <arg>1</arg>
            <file>db.hsqldb.properties#db.hsqldb.properties</file>
            <file>db.hsqldb.script#db.hsqldb.script</file>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>


    <kill name="fail">
        <message>Sqoop free form failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>



avatar
Explorer

Thanks, That worked