ClassNotFoundException: org.apache.htrace.Trace exception in spark shell for HBase cdh5.4.1

New Contributor

Doing a simple test of HBase on CDH 5.4.1:

I added the required jars explicitly, just in case, to figure out why the htrace Trace class is missing:

 

spark-shell --master yarn-client \
  --jars /opt/cloudera/parcels/CDH/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar,\
/opt/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.4.2.jar,\
/opt/cloudera/parcels/CDH/jars/hbase-protocol-1.0.0-cdh5.4.2.jar,\
/opt/cloudera/parcels/CDH/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar,\
/opt/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.4.2.jar,\
/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar

 

When Spark starts up, the jars load:

 

15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar with timestamp 1435825258057
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-server-1.0.0-cdh5.4.2.jar with timestamp 1435825258076
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-protocol-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-protocol-1.0.0-cdh5.4.2.jar with timestamp 1435825258095
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar with timestamp 1435825258096
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-common-1.0.0-cdh5.4.2.jar with timestamp 1435825258099
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar at http://192.166.1.145:46366/jars/htrace-core.

 

 

Running the following Scala code to create an HBaseAdmin object:

 

import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.client.HTable

val conf = HBaseConfiguration.create()
conf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
val admin = new HBaseAdmin(conf)

Constructing the HBaseAdmin fails with:

java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:414)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:407)
at org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal(ConnectionManager.java:285)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:207)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
at $iwC$$iwC$$iwC.<init>(<console>:50)
at $iwC$$iwC.<init>(<console>:52)
at $iwC.<init>(<console>:54)
at <init>(<console>:56)
at .<init>(<console>:60)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 52 more
Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
... 57 more
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace

 

I also referred to this post:

http://community.cloudera.com/t5/Apache-Hadoop-Concepts-and/Class-not-found-running-Spark-sample-hba...

That configuration change didn't make a difference either.

 

 

Any ideas?

 

Michael

1 ACCEPTED SOLUTION

New Contributor

The fix is in spark-env.sh.

At the end of this script you find:

 

# Set distribution classpath. This is only used in CDH 5.3 and later.
export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt")

 

I opened classpath.txt and added /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar to the end of the file.
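
A minimal sketch of that edit as a single shell command, assuming classpath.txt lives in Spark's conf directory next to spark-env.sh (the /etc/spark/conf path is an assumption; use whatever directory $SELF resolves to on your cluster):

# Append the htrace jar to the file that spark-env.sh pastes into SPARK_DIST_CLASSPATH.
# The classpath.txt location is an assumption; adjust it for your deployment.
echo "/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar" >> /etc/spark/conf/classpath.txt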

 

 

executing "/opt/cloudera/parcels/CDH/lib/spark/bin/run-example HBaseTest test2" now works.

 


5 REPLIES

New Contributor

I was able to run something similar on a MapR cluster.

 

I did some digging around and validated the issue by running the HBaseTest example on Cloudera versus MapR. I also noticed that Cloudera 5.4.1 is using HBase 1.0 versus HBase 0.98.

 

There are some significant client API changes in HBase 1.0 (for example, HBaseAdmin is deprecated in favor of Connection.getAdmin()). Not sure if this is the reason.


Super Collaborator

You always need to provide your own dependencies for your application. Spark has no dependency on HBase; the fact that some of the HBase jars get pulled in, as part of a Hive dependency that Spark has, is a coincidence.

If you build an application, you should always make sure that you resolve your own dependencies.

 

It might have worked out of the box in previous versions, or in a distribution from a different provider, because that Spark version had different dependencies.

BTW: you should be using the spark.[driver|executor].extraClassPath settings, as that is the current way to do this.
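
For example, a sketch of launching the shell that way (the jar path is taken from the earlier posts; your parcel layout may differ):

# Put the htrace jar on both the driver and executor classpaths at launch
spark-shell --master yarn-client \
  --conf spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar \
  --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar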

 

Wilfred

Contributor

It worked for me. Thanks.

These settings work for me (passed as part of the full spark-shell/spark-submit command line):

--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
--conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m"