Created 02-05-2018 01:46 PM
Dear All,
I have the following Sqoop job, which uses an HCatalog import to store the data as an ORC file. The job runs fine from the console but does not run from Oozie:
sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/******.jceks \
  --connect jdbc:sqlserver://abc.domain.com \
  --username <username> \
  --password-alias alias.password \
  --table <tablename> \
  --hcatalog-database <hive database> \
  --hcatalog-table <hive table> \
  --hcatalog-storage-stanza "stored as orcfile"
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.5" name="OZ_proj_Load">
    <global>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
    </global>
    <start to="sqoopTableLoad"/>
    <action name="sqoopTableLoad">
        <sqoop xmlns="uri:oozie:sqoop-action:0.4">
            <arg>import</arg>
            <arg>-Dorg.apache.sqoop.splitter.allow_text_splitter=true</arg>
            <arg>-Dhadoop.security.credential.provider.path=jceks://hdfs//******.jceks</arg>
            <arg>--connect</arg>
            <arg>jdbc:sqlserver://abc.domain.com</arg>
            <arg>--username</arg>
            <arg>username</arg>
            <arg>--password-alias</arg>
            <arg>alias.password</arg>
            <arg>--table</arg>
            <arg>tablename</arg>
            <arg>--hcatalog-database</arg>
            <arg>hivedatabase</arg>
            <arg>--hcatalog-table</arg>
            <arg>hivetable</arg>
            <arg>--hcatalog-storage-stanza</arg>
            <arg>"stored as orcfile"</arg>
        </sqoop>
        <ok to="end"/>
        <error to="killSqoopLoad"/>
    </action>
    <kill name="killSqoopLoad">
        <message>The workflow failed at Sqoop, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
I receive the following error. Any help is appreciated.
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org/apache/hive/hcatalog/mapreduce/HCatOutputFormat
java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatOutputFormat
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.getOutputFormatClass(DataDrivenImportJob.java:199)
    at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:263)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
    at org.apache.sqoop.manager.SQLServerManager.importTable(SQLServerManager.java:163)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
    at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
    at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:179)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
    at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:237)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.mapreduce.HCatOutputFormat
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Created 02-05-2018 02:46 PM
The HCatOutputFormat class is in the jar hive-hcatalog-core.jar. Try exporting that jar to HADOOP_CLASSPATH.
Take a look at this link for more information: https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput
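For a console run, that would look roughly like the sketch below. The jar location is an assumption (it varies by distribution and version), so locate hive-hcatalog-core.jar on your node first and adjust the path.

# Sketch only: put hive-hcatalog-core.jar on the client classpath before running sqoop.
# The path below assumes a typical HDP layout; find the actual jar first, e.g.:
#   find /usr/hdp -name "hive-hcatalog-core*.jar" 2>/dev/null
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar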
Created 02-05-2018 04:34 PM
Dear Reda,
I see the jar present in my Oozie sharelib: /user/bdd01oozie/share/lib/lib_20170829184616/hcatalog/hive-hcatalog-core-1.2.1000.2.6.1.0-129.jar
So it should work, as I am using the following properties:
oozie.use.system.libpath=true
oozie.libpath=${nameNode}/user/bdd01oozie/share/lib
Do I still need to add the jar to HADOOP_CLASSPATH?
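For reference, one way to double-check which jars the Oozie server will actually attach for an action type is the Oozie admin CLI. A quick sketch, with the server URL as a placeholder:

# Sketch: list the jars the server ships for the sqoop and hcatalog sharelibs
# (requires the oozie client; replace the URL with your Oozie server).
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist sqoop
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist hcatalog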
Thanks and Best Regards,
Gagan
Created 07-26-2018 08:37 PM
I'm not sure whether you found a solution for this or not. For other users facing the same issue: you need to add the property below to your workflow to tell Oozie to load the required sharelibs for the Sqoop action.
<property>
    <name>oozie.action.sharelib.for.sqoop</name>
    <value>sqoop,hive,hcatalog</value>
</property>
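For example, the property can go inside the sqoop action's own <configuration> block; a sketch based on the workflow above (the same key/value pair can alternatively be set in job.properties as oozie.action.sharelib.for.sqoop=sqoop,hive,hcatalog):

<action name="sqoopTableLoad">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
        <!-- Override the sharelibs loaded for this action so the HCatalog jars land on the classpath -->
        <configuration>
            <property>
                <name>oozie.action.sharelib.for.sqoop</name>
                <value>sqoop,hive,hcatalog</value>
            </property>
        </configuration>
        <arg>import</arg>
        <!-- ... remaining <arg> elements exactly as in the original workflow ... -->
    </sqoop>
    <ok to="end"/>
    <error to="killSqoopLoad"/>
</action>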