Oozie spark job java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

The Spark batch job runs fine when submitted from the shell with the following command:

spark-submit --class com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive \
--master yarn-cluster \
--num-executors 5 \
--driver-memory 3g \
--executor-memory 3g \
--executor-cores 1 \
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
--conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
spark_demo-1.0-SNAPSHOT-shaded.jar 20170219

 

But when I submit the same batch job through Oozie, an exception occurs. Here is the log:

17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode3:8042/node/containerlogs/container_1487752257960_0334_02_000006/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode3:8042/node/containerlogs/container_1487752257960_0334_02_000006/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://namenode:8042/node/containerlogs/container_1487752257960_0334_02_000003/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://namenode:8042/node/containerlogs/container_1487752257960_0334_02_000003/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode:8042/node/containerlogs/container_1487752257960_0334_02_000004/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode:8042/node/containerlogs/container_1487752257960_0334_02_000004/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode2:8042/node/containerlogs/container_1487752257960_0334_02_000002/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode2:8042/node/containerlogs/container_1487752257960_0334_02_000002/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar<CPS>{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://datanode0:8042/node/containerlogs/container_1487752257960_0334_02_000005/zhuj/stderr?start=0, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1487752257960_0334, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 98343005,176766064, SPARK_USER -> zhuj, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PUBLIC, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1487832656812,1487815532194, SPARK_LOG_URL_STDOUT -> http://datanode0:8042/node/containerlogs/container_1487752257960_0334_02_000005/zhuj/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://nameservice/user/zhuj/.sparkStaging/application_1487752257960_0334/spark-assembly.jar#__spark__.jar,hdfs://nameservice/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar#__app__.jar)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 1, --hostname, datanode2, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 5, --hostname, datanode3, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 3, --hostname, datanode, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 2, --hostname, namenode, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO yarn.ExecutorRunnable: Setting up executor with commands: List({{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms3072m, -Xmx3072m, '-XX:PermSize=1024m', -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.driver.port=40826', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@datanode3:40826/user/CoarseGrainedScheduler, --executor-id, 4, --hostname, datanode0, --cores, 1, --app-id, application_1487752257960_0334, --user-class-path, file:$PWD/__app__.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : namenode:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode2:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode3:8041
17/02/23 14:51:46 INFO impl.ContainerManagementProtocolProxy: Opening proxy : datanode0:8041
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode:36494/user/Executor#311844886] with ID 3
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode:40821 with 1589.8 MB RAM, BlockManagerId(3, datanode, 40821)
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode0:44241/user/Executor#-1406859909] with ID 4
17/02/23 14:51:49 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode2:35227/user/Executor#66771502] with ID 1
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode0:34517 with 1589.8 MB RAM, BlockManagerId(4, datanode0, 34517)
17/02/23 14:51:49 INFO storage.BlockManagerMasterActor: Registering block manager datanode2:59608 with 1589.8 MB RAM, BlockManagerId(1, datanode2, 59608)
17/02/23 14:51:50 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@datanode3:40349/user/Executor#1475870089] with ID 5
17/02/23 14:51:50 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
17/02/23 14:51:50 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
17/02/23 14:51:50 ERROR yarn.ApplicationMaster: User class threw exception: org/apache/hadoop/hive/conf/HiveConf
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf
        at com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive$.main(SmsStatBy3DayDrive.scala:87)
        at com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive.main(SmsStatBy3DayDrive.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

 

 

How do I set the Spark extra options in the Oozie action? My Oozie workflow definition is below:

<workflow-app name="SmsStatBy3DayDrive" xmlns="uri:oozie:workflow:0.5">
  <global>
            <configuration>
                <property>
                    <name></name>
                    <value></value>
                </property>
            </configuration>
  </global>
    <start to="spark-3b65"/>
    <kill name="Kill">
        <message>Operation failed, error message [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="spark-3b65">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <mode>cluster</mode>
            <name>SmsStatBy3DayDrive</name>
            <class>com.raiyi.spark.smscount.batch.SmsStatBy3DayDrive</class>
            <jar>${nameNode}/user/zhuj/batchjars/spark_demo-1.0-SNAPSHOT-shaded.jar</jar>
            <spark-opts>--num-executors 5 --driver-memory 3g --executor-memory 3g --executor-cores 1 --conf spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar --conf spark.driver.extraJavaOptions=-XX:PermSize=1024m --conf spark.executor.extraJavaOptions=-XX:PermSize=1024m</spark-opts>
            <arg>${executeDate}</arg>
        </spark>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
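
For reference, one way to get HiveConf onto an Oozie Spark action's classpath without hard-coding parcel paths in --conf is to pull Oozie's hive sharelib in alongside the spark one via job.properties. A minimal sketch, assuming an Oozie version with per-action sharelib overrides (Oozie 4.x, as shipped in CDH 5.4); the deployment path, RM address, and date value below are illustrative assumptions, while the sharelib properties themselves are standard Oozie:

# job.properties (sketch; resourcemanager-host is an assumed RM address)
nameNode=hdfs://nameservice
jobTracker=resourcemanager-host:8032
oozie.wf.application.path=${nameNode}/user/zhuj/workflows/SmsStatBy3DayDrive
# make the system sharelib visible to the action
oozie.use.system.libpath=true
# load the hive sharelib in addition to spark for spark actions
oozie.action.sharelib.for.spark=spark,hive
executeDate=20170219

With the hive sharelib on the action classpath, the spark.driver.extraClassPath and spark.executor.extraClassPath entries in <spark-opts> should no longer be needed just for HiveConf.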

Please help me!

2 REPLIES

Re: Oozie spark job java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

CDH version is 5.4.7

Oozie workflow schema 0.5

Spark 1.3.0


Re: Oozie spark job java.lang.NoClassDefFoundError: org/apache/hadoop/hive/conf/HiveConf

The reason is that this --conf does not take effect:

spark.driver.extraClassPath

How can I work around it?
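
If the --conf classpath entries in <spark-opts> are being dropped by the Spark action, a common workaround is to stage the Hive client jars into the workflow's lib/ directory in HDFS, since Oozie automatically puts every jar under lib/ on the action classpath. A rough sketch, assuming the workflow is deployed under /user/zhuj/workflows/SmsStatBy3DayDrive (an assumed path; the parcel path is the CDH 5.4.7 one from the question above):

# assumed workflow deployment dir; adjust to the real oozie.wf.application.path
WF_DIR=/user/zhuj/workflows/SmsStatBy3DayDrive
HIVE_LIB=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib
hdfs dfs -mkdir -p $WF_DIR/lib
# HiveConf lives in hive-common (and is also bundled inside hive-exec)
hdfs dfs -put $HIVE_LIB/hive-common-*.jar \
             $HIVE_LIB/hive-exec-*.jar \
             $HIVE_LIB/hive-metastore-*.jar \
             $WF_DIR/lib/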
