Support Questions

Find answers, ask questions, and share your expertise

Oozie Sqoop Action works for list-databases but does not work for import

avatar
Expert Contributor

I'm using HDP 2.3.2 sandbox. To make sqoop from oozie run, I needed:

- set port forwarding for 10020 in sandbox to remove connection refuse issue

- copy oracle jdbc drive to HDFS /user/oozie/share/lib/lib_20151027124452/sqoop directory

Now, I can run sqoop list-tables or list-databases command; however, sqoop import fails with both HDFS and Hive.

Working command:

list-tables --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11

Not Working command:

sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --table A_BASE --verbose --target-dir /user/hue/test1 -m 1

sqoop import --connect jdbc:oracle:thin:@//11.111.11.111:1521/dev11 --password m11 --username m11 --hive-table A_BASE --table A_BASE --verbose --hive-import -m 1

These commands all work from a SSH command line.

The error message is: Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]

I wish we could see the detail debug information for this error return.

1 ACCEPTED SOLUTION

avatar
Expert Contributor

This is my summary:

1. download HDP sandbox

2. create a worflow that does list-databases and run it from Hue. We expect the failure.

3. To see the error log, localhost:19888/jobhistory/?user.name=hue, click job Id link, click value 1 link for Maps at the bottom of tables. In the new web page, click logs link. You should see java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver.

4. open SSH shell as root. Copy oracle jdbc jar to /home/oozie from your local file system.

5. run su - oozie

6. run hdfs dfs -put ojdbc7-12.1.0.1.0.jar /user/oozie/share/lib/lib_20151027124452/sqoop

7. check if oracle jdbc jar is copied or not from Ambari. Use HDFS Files viewer from clicking the second icon from the right side on the Ambari page. Navigate it /user/oozie/share/lib/lib_20151027124452/sqoop.

8. Restart sandbox.

9. Run the workflow created at step 2.

10. It should work.

11. create another workflow that does import Hive and run it from Hue.

12. You see the warning message: 2016-01-28 21:10:23,398 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. Intercepting System.exit(1)

13. Need the followings two things:

(1) add oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop >> To do this from Hue oozie workflow web page, click "Edit properties" link and add this property to Oozie parameters. The oozie.use.system.libpath is already set true for default. Just add it.

(2) copy hive-site.xml to lib dir >> To do this from Hue, click File Browser, click oozie that takes you to /user/hue/oozie. Click workspaces, click _hue_xxx directory for the current workflow, create lib directory in the identified directory, copy hive-site.xml file that contains something like: (you don't need to update jdbc connection string of yours; it looks it needs hive.metastore.uris value; so maybe you can delete the first property, which I have not tried yet.)

<configuration>

<property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value> <description>JDBC connect string for a JDBC metastore</description>

</property>

<property> <name>hive.metastore.uris</name> <value>thrift://sandbox.hortonworks.com:9083</value>

</property>

</configuration>

Once this file is created, go back to the oozie workflow editor, and click Add Path button and select hive-site.xml. Save it.

14. Run the workflow. It should run.

View solution in original post

17 REPLIES 17

avatar
Expert Contributor

Since whenever we submit the workflow, it will create a new directory, it does not work.

I did sudo -u yarn with sqoop import. I does not return code 1 any more, which is good, but neither HDFS file nor Hive table is created.

Any idea?

avatar
Expert Contributor

Since I could see the detail error message from http://localhost:19888/jobhistory/?user.name=hue, I could identify what's wrong. it was due to a temp dir for A_BASE; after I delete this, it works, but since so many things I touched, I will do from the scratch and I will comment my summary here.

Thanks, Artem for this with me.

avatar
Master Mentor

excellent work @Shigeru Takehara

avatar
Expert Contributor

This is my summary:

1. download HDP sandbox

2. create a worflow that does list-databases and run it from Hue. We expect the failure.

3. To see the error log, localhost:19888/jobhistory/?user.name=hue, click job Id link, click value 1 link for Maps at the bottom of tables. In the new web page, click logs link. You should see java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver.

4. open SSH shell as root. Copy oracle jdbc jar to /home/oozie from your local file system.

5. run su - oozie

6. run hdfs dfs -put ojdbc7-12.1.0.1.0.jar /user/oozie/share/lib/lib_20151027124452/sqoop

7. check if oracle jdbc jar is copied or not from Ambari. Use HDFS Files viewer from clicking the second icon from the right side on the Ambari page. Navigate it /user/oozie/share/lib/lib_20151027124452/sqoop.

8. Restart sandbox.

9. Run the workflow created at step 2.

10. It should work.

11. create another workflow that does import Hive and run it from Hue.

12. You see the warning message: 2016-01-28 21:10:23,398 WARN [main] tool.SqoopTool (SqoopTool.java:loadPluginsFromConfDir(177)) - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration. Intercepting System.exit(1)

13. Need the followings two things:

(1) add oozie.action.sharelib.for.sqoop=hive,hcatalog,sqoop >> To do this from Hue oozie workflow web page, click "Edit properties" link and add this property to Oozie parameters. The oozie.use.system.libpath is already set true for default. Just add it.

(2) copy hive-site.xml to lib dir >> To do this from Hue, click File Browser, click oozie that takes you to /user/hue/oozie. Click workspaces, click _hue_xxx directory for the current workflow, create lib directory in the identified directory, copy hive-site.xml file that contains something like: (you don't need to update jdbc connection string of yours; it looks it needs hive.metastore.uris value; so maybe you can delete the first property, which I have not tried yet.)

<configuration>

<property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value> <description>JDBC connect string for a JDBC metastore</description>

</property>

<property> <name>hive.metastore.uris</name> <value>thrift://sandbox.hortonworks.com:9083</value>

</property>

</configuration>

Once this file is created, go back to the oozie workflow editor, and click Add Path button and select hive-site.xml. Save it.

14. Run the workflow. It should run.

avatar
Master Mentor

@Shigeru Takehara excellent, glad it worked!

avatar
Expert Contributor

Thank you, Artem.

avatar
Master Mentor

please accept the best answer and I recommend you take your steps and create an article instead of question on this site, it will be helpful for everyone and you will earn reputation! @Shigeru Takehara

avatar
Expert Contributor

When I added Hive action before the Sqoop action, I got hive-site.xml file permission error. To avoid this problem, delete lib/hive-site.xml file, and add this file to inside of Sqoop action as a file.