Does the "--skip-dist-cache" parameter in a Sqoop command run from a workflow cause the class-not-found error?


I ran a Sqoop test that loads data from MySQL into Hive through an Ambari workflow, and the results puzzled me.

First: I initialized the Oozie sharelib, which is located at hdfs://flx87:8020/user/oozie/share/lib/lib_20180608191050, and added a conf and an extlib directory to it:

>hadoop fs -mkdir /user/oozie/share/lib/lib_20180608191050/conf
>hadoop fs -mkdir /user/oozie/share/lib/lib_20180608191050/extlib
>hadoop fs -copyFromLocal /path/to/jdbcDriver.jar /user/oozie/share/lib/lib_20180608191050/extlib/
>hadoop fs -copyFromLocal /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml /etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/mapred-site.xml /etc/sqoop/conf/sqoop-site.xml /etc/hive/conf/hive-site.xml /etc/tez/conf/tez-site.xml /etc/oozie/conf/oozie-site.xml /user/oozie/share/lib/lib_20180608191050/conf/
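
For anyone reproducing this setup: files copied into an existing sharelib directory may not be picked up until Oozie refreshes its sharelib listing. The sketch below checks the two new directories and then asks Oozie to refresh and list them. The Oozie URL is an assumption (the post does not give it), and `verify_sharelib` is a hypothetical helper name; only the sharelib path comes from the post.

```shell
# Assumption: Oozie server URL; override via the environment if yours differs.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"
# Sharelib path taken from the post.
SHARELIB="/user/oozie/share/lib/lib_20180608191050"

# Hypothetical helper: confirm the files are in HDFS, refresh the
# sharelib on the Oozie server, then list what Oozie sees under "extlib".
verify_sharelib() {
  hadoop fs -ls "$SHARELIB/extlib" "$SHARELIB/conf" &&
  oozie admin -oozie "$OOZIE_URL" -sharelibupdate &&
  oozie admin -oozie "$OOZIE_URL" -shareliblist extlib
}
```

Calling `verify_sharelib` on a node with the Hadoop and Oozie clients installed should show the JDBC driver jar in the final listing if the refresh worked.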

Second: I used the workflow editor in Ambari to create a Sqoop node and tested it four times with four different sets of parameters, getting four different results. Each time I submitted the workflow, I set the custom job property oozie.action.sharelib.for.sqoop = sqoop,hive,hcatalog,conf,extlib
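
For readers who submit from the command line rather than through Ambari, a roughly equivalent submission might look like the sketch below. This is an assumption about how the same property could be passed, not what the Ambari workflow editor does internally. The Oozie URL and the `submit_sqoop_workflow` helper are hypothetical; the property name and value are the ones quoted above.

```shell
# Assumption: Oozie server URL; override via the environment if yours differs.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"
# Sharelib list taken from the post.
SQOOP_SHARELIBS="sqoop,hive,hcatalog,conf,extlib"

# Hypothetical helper: submit a workflow, passing the sharelib list
# as a -D property override. $1 is the path to a job.properties file.
submit_sqoop_workflow() {
  oozie job -oozie "$OOZIE_URL" \
    -config "$1" \
    -D "oozie.action.sharelib.for.sqoop=$SQOOP_SHARELIBS" \
    -run
}
```

Usage would be `submit_sqoop_workflow job.properties`, where the properties file names the workflow application path.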

>1st time : use hive-import

import --connect jdbc:mysql://flx93:3306/test --username test --password-file /user/hdfs/sqoopTestPasswd --table t_data --hive-database test --hive-table t_data --hive-import --hive-overwrite 
# Everything worked fine.

>2nd time : use hive-import and add skip-dist-cache

import --connect jdbc:mysql://flx93:3306/test --username test --password-file /user/hdfs/sqoopTestPasswd --table t_data --hive-database test --hive-table t_data --hive-import --hive-overwrite --skip-dist-cache 
## Error occurred. RawKeyTextOutputFormat not found
2018-07-29 01:07:11,732 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2018-07-29 01:07:11,733 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2018-07-29 01:07:11,777 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.sqoop.mapreduce.RawKeyTextOutputFormat not found
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.sqoop.mapreduce.RawKeyTextOutputFormat not found
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:521)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:501)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1640)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:501)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:287)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1598)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1595)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1526)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.sqoop.mapreduce.RawKeyTextOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2308)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:223)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:518)
... 11 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.sqoop.mapreduce.RawKeyTextOutputFormat not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2214)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2306)
... 13 more

>3rd time : use hcatalog

import --connect jdbc:mysql://flx93:3306/test --username test --password-file /user/hdfs/sqoopTestPasswd --table t_data --hcatalog-database test --hcatalog-table t_data 
## Error occurred.
ERROR org.apache.sqoop.tool.ImportTool  - Imported Failed: Can not create a Path from an empty string

>4th time : use hcatalog and add skip-dist-cache

import --connect jdbc:mysql://flx93:3306/test --username test --password-file /user/hdfs/sqoopTestPasswd --table t_data --hcatalog-database test --hcatalog-table t_data --skip-dist-cache
## Error occurred. HCatOutputFormat not found
2018-07-29 01:30:17,367 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Using mapred newApiCommitter.
2018-07-29 01:30:17,369 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter set in config null
2018-07-29 01:30:17,571 INFO [main] org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.mapreduce.v2.app.MRAppMaster failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:521)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:501)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.callWithJobClassLoader(MRAppMaster.java:1640)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.createOutputCommitter(MRAppMaster.java:501)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceInit(MRAppMaster.java:287)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$5.run(MRAppMaster.java:1598)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1595)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1526)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2308)
        at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:223)
        at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$2.call(MRAppMaster.java:518)
        ... 11 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.mapreduce.HCatOutputFormat not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2214)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2306)
        ... 13 more

So does --skip-dist-cache cause the class-not-found errors?

Please help me figure this out.
