Created on 07-02-2015 01:40 AM - edited 09-16-2022 02:33 AM
Doing a simple test of HBase on CDH 5.4.1:
I added the required jars explicitly, just in case, to figure out why the Trace class is missing.
spark-shell --master yarn-client --jars /opt/cloudera/parcels/CDH/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/hbase-protocol-1.0.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar
When Spark starts up, the jars load:
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hive-hbase-handler-1.1.0-cdh5.4.2.jar with timestamp 1435825258057
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-server-1.0.0-cdh5.4.2.jar with timestamp 1435825258076
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-protocol-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-protocol-1.0.0-cdh5.4.2.jar with timestamp 1435825258095
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-hadoop2-compat-1.0.0-cdh5.4.2.jar with timestamp 1435825258096
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.4.2.jar at http://192.166.1.145:46366/jars/hbase-common-1.0.0-cdh5.4.2.jar with timestamp 1435825258099
15/07/02 01:20:58 INFO SparkContext: Added JAR file:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar at http://192.166.1.145:46366/jars/htrace-core.
Running the following Scala code with the HBaseAdmin object:
import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.HTable;
val conf = HBaseConfiguration.create()
conf.addResource(new Path("/etc/hbase/conf/core-site.xml"))
conf.addResource(new Path("/etc/hbase/conf/hbase-site.xml"))
val admin = new HBaseAdmin(conf)
This fails with:
java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:414)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:407)
at org.apache.hadoop.hbase.client.ConnectionManager.getConnectionInternal(ConnectionManager.java:285)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:207)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:38)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:42)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:44)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:46)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:48)
at $iwC$$iwC$$iwC.<init>(<console>:50)
at $iwC$$iwC.<init>(<console>:52)
at $iwC.<init>(<console>:54)
at <init>(<console>:56)
at .<init>(<console>:60)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:856)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:901)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:813)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:656)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:664)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:669)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:996)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:944)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:944)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1058)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 52 more
Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
... 57 more
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace
I also referred to this post; that configuration change didn't make a difference either.
Any ideas?
Michael
Created 07-03-2015 08:08 PM
I was able to run something similar on a MapR cluster.
I did some digging around and validated the issue by running the HBaseTest example on Cloudera versus MapR. I also noticed that Cloudera 5.4.1 is using HBase 1.0 versus HBase 0.98.
There are some significant client API changes in HBase 1.0; not sure if this is the reason.
Created on 07-04-2015 11:55 PM - edited 07-04-2015 11:56 PM
In spark-env.sh, at the end of the script, you find:
# Set distribution classpath. This is only used in CDH 5.3 and later.
export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt")
I opened classpath.txt and added /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar to the end of the file.
Executing "/opt/cloudera/parcels/CDH/lib/spark/bin/run-example HBaseTest test2" now works.
Created 07-08-2015 12:43 AM
You always need to provide your own dependencies for your application. Spark has no dependency on HBase; the fact that some of the HBase jars are pulled in as part of a Hive dependency that Spark has is a coincidence.
If you build an application, you should always make sure that you resolve your own dependencies.
It might have worked out of the box in previous versions, or in a distribution from a different provider, because that Spark version had different dependencies.
BTW: you should be using the spark.[driver|executor].extraClassPath settings, as that is the current way to do this.
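For example, a sketch of that approach applied to the original spark-shell invocation; the htrace jar path follows the CDH parcel layout mentioned earlier, so adjust it for your version:
# Put the htrace jar on both the driver and executor classpaths at launch time.
spark-shell --master yarn-client \
  --conf spark.driver.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar \
  --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar \
  --jars /opt/cloudera/parcels/CDH/jars/hbase-server-1.0.0-cdh5.4.2.jar,/opt/cloudera/parcels/CDH/jars/hbase-common-1.0.0-cdh5.4.2.jar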
Wilfred
Created 06-21-2016 01:53 AM
It worked for me. Thanks!
Created on 02-14-2017 01:29 AM - edited 02-14-2017 01:30 AM
--conf "spark.driver.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.executor.extraClassPath=/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/jars/htrace-core-3.1.0-incubating.jar:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/conf:/opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/lib/hive/lib/*.jar" \
--conf "spark.driver.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
--conf "spark.executor.extraJavaOptions=-XX:MaxPermSize=1024m -XX:PermSize=256m" \
These settings work for me.