
Oozie Sqoop Action works for list-databases but does not work for import

Expert Contributor

I'm using the HDP 2.3.2 sandbox. To get Sqoop running from Oozie, I needed to:

- set up port forwarding for port 10020 in the sandbox to resolve a connection-refused issue

- copy the Oracle JDBC driver to the HDFS /user/oozie/share/lib/lib_20151027124452/sqoop directory
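For reference, if you run the sandbox in VirtualBox, the port forwarding mentioned above can be set up from the host roughly like this (a sketch; the VM name is an assumption, and the VM must be powered off for modifyvm):

VBoxManage modifyvm "Hortonworks Sandbox" --natpf1 "jobhistory,tcp,,10020,,10020"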

Now I can run the sqoop list-tables and list-databases commands; however, sqoop import fails for both HDFS and Hive targets.

Working command:

list-tables --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11

Not working commands:

sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/hue/test1 -m 1

sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --hive-table A_BASE --table A_BASE --verbose --hive-import -m 1

All of these commands work from an SSH command line.

The error message is: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

I wish we could see detailed debug information for this error.
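For what it's worth, more detail can usually be pulled from a shell (a sketch; the IDs are placeholders you take from the Oozie or Hue UI, and the Oozie URL is the default):

# Oozie action log for the failed workflow
oozie job -oozie http://localhost:11000/oozie -log <oozie-job-id>

# YARN container logs for the launcher job
yarn logs -applicationId <application-id>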

1 ACCEPTED SOLUTION

Expert Contributor

This is my summary:

1. Download the HDP sandbox.

2. Create a workflow that runs list-databases and run it from Hue. At this point we expect it to fail.

3. To see the error log, open localhost:19888/jobhistory/?user.name=hue, click the job ID link, then click the value 1 link for Maps at the bottom of the table. On the new page, click the logs link. You should see: java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver.

4. Open an SSH shell as root and copy the Oracle JDBC jar from your local file system to /home/oozie.

5. Run su - oozie.

6. Run hdfs dfs -put ojdbc7-12.1.0.1.0.jar /user/oozie/share/lib/lib_20151027124452/sqoop

7. Verify from Ambari that the Oracle JDBC jar was copied: open the HDFS Files view (the second icon from the right on the Ambari page) and navigate to /user/oozie/share/lib/lib_20151027124452/sqoop.

8. Restart the sandbox.

9. Run the workflow created in step 2.

10. It should now succeed.

11. Create another workflow that does a Hive import and run it from Hue.

12. You will see the warning message: 2016-01-28 21:10:23,398 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. Intercepting System.exit(1)

13. Two things are needed:

(1) Add oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop. To do this from the Hue Oozie workflow page, click the "Edit properties" link and add this property to the Oozie parameters. The oozie.use.system.libpath property is already set to true by default; just add the new one.

(2) Copy hive-site.xml to the lib directory. To do this from Hue, click File Browser, then click oozie, which takes you to /user/hue/oozie. Click workspaces, click the _hue_xxx directory for the current workflow, create a lib directory there, and copy in a hive-site.xml file containing something like the following. (You don't need to change the JDBC connection string to your own; it appears only the hive.metastore.uris value is needed, so you may be able to delete the first property, though I have not tried that yet.)

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://sandbox.hortonworks.com:9083</value>
    </property>
</configuration>

Once this file is created, go back to the Oozie workflow editor, click the Add Path button, select hive-site.xml, and save.
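For reference, after both changes the sqoop action in workflow.xml should end up shaped roughly like the sketch below (the action name, transitions, and schema version are my assumptions, not taken from the original workflow):

<action name="sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <command>import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --hive-table A_BASE --table A_BASE --hive-import -m 1</command>
        <file>hive-site.xml</file>
    </sqoop>
    <ok to="end"/>
    <error to="kill"/>
</action>

As a sanity check for item (1), the contents of the Sqoop sharelib can also be listed from a shell (assuming the default Oozie URL):

oozie admin -oozie http://localhost:11000/oozie -shareliblist sqoop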

14. Run the workflow. It should now run successfully.


17 REPLIES

Master Mentor

@Shigeru Takehara Please make sure you have the following property set:

oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop

Here's a link to my explanation. Please post the errors from the Oozie job run.

Expert Contributor

Where should we set oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop?

I'm using Hue's Oozie web interface. There is an "Edit properties" link where I can set Oozie parameters and Hadoop job properties. Since I don't know which is correct, I set both, but the result is still "failed".

Could you tell me where we should set it? Also, the link above does not lead to an available web page.

Master Mentor
@Shigeru Takehara

In your job.properties file. I am not intimately familiar with the Hue Oozie editor, so I cannot comment on it for sure.
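A minimal job.properties might look like this (a sketch; the host name and ports are sandbox defaults, and the application path is a placeholder):

nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
oozie.use.system.libpath=true
oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop
oozie.wf.application.path=${nameNode}/user/${user.name}/<workflow-dir>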

Expert Contributor

I got the following error (I used your sqoop shell example):

Stdoutput I AM: yarn

Stdoutput Running On: sandbox.hortonworks.com

Stdoutput CWD: /hadoop/yarn/local/usercache/root/appcache/application_1454000721165_0004/container_e21_1454000721165_0004_01_000002

Stdoutput Can I see these clients? Hive = /usr/bin/hive, Sqoop = /usr/bin/sqoop

<<< Invocation of Main class completed <<<

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Expert Contributor

Additional information:

I added the working sqoop command from above to script.sh. I can see that it works, but it still fails at sqoop import.
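Based on the stdout in the previous reply, the script is shaped roughly like this (a reconstruction, not the exact file; the echo lines mirror the output and the sqoop commands are the ones quoted earlier):

#!/bin/bash
# diagnostics that produced the Stdoutput lines above
echo "I AM: $(whoami)"
echo "Running On: $(hostname)"
echo "CWD: $(pwd)"
echo "Can I see these clients? Hive = $(which hive), Sqoop = $(which sqoop)"

# this one works...
sqoop list-tables --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11

# ...and this is where it fails
sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/hue/test1 -m 1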

Expert Contributor

When I ran sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/root/test1 -m 1,

I got an interesting log:

at org.apache.sqoop.Sqoop.main(Sqoop.java:244)

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=yarn, access=WRITE, inode="/user/yarn/.staging":hdfs:hdfs:drwxr-xr-x

Master Mentor

@Shigeru Takehara Are you running Sqoop as user yarn? If not, then this is an expected situation. In my script, I su as yarn to get that working.
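(For example, something along these lines in the shell script; a sketch only, reusing the import command from above:)

su - yarn -c 'sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/root/test1 -m 1'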

Expert Contributor

I ran:

sudo -u hdfs hdfs dfs -mkdir /user/yarn
sudo -u hdfs hdfs dfs -chown yarn:hdfs /user/yarn

I got:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=EXECUTE, inode="/user/yarn/.staging/job_1454000721165_0009/libjars/commons-compress-1.4.1.jar":yarn:hdfs:drwx------

How should we proceed from here?

Master Mentor

@Shigeru Takehara Do a recursive chown with chown -R.
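That is, something like this, matching the directory and ownership used above:

sudo -u hdfs hdfs dfs -chown -R yarn:hdfs /user/yarn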