Use HCatalog from Sqoop in Oozie


New Contributor

Dear All,

I have the following Sqoop job, which uses HCatalog to import a table as an ORC file. The job runs fine from the console but fails when run from Oozie.

sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true \
  -Dhadoop.security.credential.provider.path=jceks://hdfs/******.jceks \
  --connect jdbc:sqlserver://abc.domain.com \
  --username <username> \
  --password-alias alias.password \
  --table <tablename> \
  --hcatalog-database <hive database> \
  --hcatalog-table <hive table> \
  --hcatalog-storage-stanza "stored as orcfile"

<?xml version="1.0" encoding="UTF-8"?>


<workflow-app xmlns="uri:oozie:workflow:0.5" name="OZ_proj_Load">
  <global>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
  </global>
  <start to="sqoopTableLoad"/>
  <action name="sqoopTableLoad">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <arg>import</arg>
      <arg>-Dorg.apache.sqoop.splitter.allow_text_splitter=true</arg>
      <arg>-Dhadoop.security.credential.provider.path=jceks://hdfs//******.jceks</arg>
      <arg>--connect</arg>
      <arg>jdbc:sqlserver://abc.domain.com</arg>
      <arg>--username</arg>
      <arg>username</arg>
      <arg>--password-alias</arg>
      <arg>alias.password</arg>
      <arg>--table</arg>
      <arg>tablename</arg>
      <arg>--hcatalog-database</arg>
      <arg>hivedatabase</arg>
      <arg>--hcatalog-table</arg>
      <arg>hivetable</arg>
      <arg>--hcatalog-storage-stanza</arg>
      <arg>"stored as orcfile"</arg>
    </sqoop>
    <ok to="end"/>
    <error to="killSqoopLoad"/>
  </action>
  <kill name="killSqoopLoad">
    <message>The workflow failed at Sqoop, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

I receive the following error. Any help is appreciated.

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org/apache/hive/hcatalog/mapreduce/HCatOutputFormat
java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatOutputFormat
	at org.apache.sqoop.mapreduce.DataDrivenImportJob.getOutputFormatClass(DataDrivenImportJob.java:199)
	at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:263)
	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
	at org.apache.sqoop.manager.SQLServerManager.importTable(SQLServerManager.java:163)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
	at org.apache.oozie.action.hadoop.SqoopMain.runSqoopJob(SqoopMain.java:197)
	at org.apache.oozie.action.hadoop.SqoopMain.run(SqoopMain.java:179)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:58)
	at org.apache.oozie.action.hadoop.SqoopMain.main(SqoopMain.java:48)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:237)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164)
Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.mapreduce.HCatOutputFormat
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
3 REPLIES

Re: Use HCatalog from Sqoop in Oozie

New Contributor

The HCatOutputFormat class is in hive-hcatalog-core.jar. Try adding that jar to HADOOP_CLASSPATH.

Take a look at this link for more information: https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput

Re: Use HCatalog from Sqoop in Oozie

New Contributor

Dear Reda,

I can see the jar in my Oozie sharelib: /user/bdd01oozie/share/lib/lib_20170829184616/hcatalog/hive-hcatalog-core-1.2.1000.2.6.1.0-129.jar

So it should work, since I am using these properties:

oozie.use.system.libpath=true

oozie.libpath=${nameNode}/user/bdd01oozie/share/lib

Do I still need to add the jar to HADOOP_CLASSPATH?

Thanks and Best Regards,

Gagan

Re: Use HCatalog from Sqoop in Oozie

Super Guru

I'm not sure whether you found a solution. For other users facing the same issue: add the property below to your workflow to tell Oozie to load the required sharelibs for the Sqoop action.

<property>
  <name>oozie.action.sharelib.for.sqoop</name>
  <value>sqoop,hive,hcatalog</value>
</property>
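
For anyone unsure where that property goes, here is a minimal sketch of one possible placement, assuming it is set inside the Sqoop action's own <configuration> element in the workflow from the original question. The action name and the ok/error transitions are copied from that workflow; the comment stands in for the remaining <arg> elements, which are unchanged. By default the Sqoop action likely loads only the sqoop sharelib directory, which would explain why the jar under .../hcatalog/ was not picked up even with oozie.use.system.libpath=true.

```xml
<action name="sqoopTableLoad">
  <sqoop xmlns="uri:oozie:sqoop-action:0.4">
    <!-- <configuration> must come before the <arg> elements in the sqoop-action schema -->
    <configuration>
      <property>
        <!-- load the hive and hcatalog sharelib directories in addition to sqoop -->
        <name>oozie.action.sharelib.for.sqoop</name>
        <value>sqoop,hive,hcatalog</value>
      </property>
    </configuration>
    <arg>import</arg>
    <!-- remaining <arg> elements from the original workflow go here, unchanged -->
  </sqoop>
  <ok to="end"/>
  <error to="killSqoopLoad"/>
</action>
```

The same key can, as far as I know, also be set in job.properties as oozie.action.sharelib.for.sqoop=sqoop,hive,hcatalog, which applies it to every Sqoop action in the job rather than to a single action.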