
Spark Hbase connector for Spark 1.6.1 version


New Contributor

We are planning to use the Spark HBase connector from Hortonworks for a new project: https://github.com/hortonworks-spark/shc

Since we are using Hortonworks HDP 2.4.2, the supported Spark version is 1.6.1.

Can we use this Spark-HBase connector JAR with Spark 1.6.1?

1 ACCEPTED SOLUTION

Re: Spark Hbase connector for Spark 1.6.1 version

@Sankaraiah Narayanasamy

That is supported. I am sure you have already researched this connector. This article points out that Spark 1.6.1 is supported, and that the connector works with practically any Spark version since 1.2: http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/. The GitHub repository confirms the same; look at the properties section of pom.xml: https://github.com/hortonworks-spark/shc/blob/master/pom.xml:

<properties>
  <spark.version>1.6.1</spark.version>
  <hbase.version>1.1.2</hbase.version>
  ...
</properties>

Use the Spark-on-HBase connector as a standard Spark package.
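Once the package is on the classpath, reading an HBase table as a DataFrame looks roughly like this (a minimal sketch based on the connector's README; the table name, namespace, and column mappings below are hypothetical):

import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Hypothetical catalog: maps HBase table "table1" (row key plus column
// family "cf1") to DataFrame columns. Adjust to your own schema.
val catalog = """{
    |"table":{"namespace":"default", "name":"table1"},
    |"rowkey":"key",
    |"columns":{
      |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
      |"col1":{"cf":"cf1", "col":"col1", "type":"string"}
    |}
  |}""".stripMargin

// sqlContext is the SQLContext available in spark-shell on Spark 1.6.
val df = sqlContext.read
  .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
  .format("org.apache.spark.sql.execution.datasources.hbase")
  .load()

df.show()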


If the response was helpful, please vote and accept the best answer.

6 REPLIES



Re: Spark Hbase connector for Spark 1.6.1 version

New Contributor
@Constantin Stanca: Can I use this as a Maven dependency, or should I use it as a standard Spark package? What is the difference? I have never used a standard Spark package.

Re: Spark Hbase connector for Spark 1.6.1 version

@Sankaraiah Narayanasamy

To include the Spark-on-HBase connector as a standard Spark package, pass it on the command line when launching spark-shell, pyspark, or spark-submit:

> $SPARK_HOME/bin/spark-shell --packages zhzhan:shc:0.0.11-1.6.1-s_2.10

You can also include the package as a dependency in your SBT file. The format is spark-package-name:version:

spDependencies += "zhzhan/shc:0.0.11-1.6.1-s_2.10"
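Note that spDependencies is provided by the sbt-spark-package plugin rather than by sbt itself, so it has to be enabled in project/plugins.sbt first (the plugin version below is an assumption):

// project/plugins.sbt -- sketch; this plugin provides the spDependencies key.
resolvers += "Spark Packages repo" at "https://dl.bintray.com/spark-packages/maven/"
addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")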

You can also use it as a Maven dependency.
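As a sketch for Maven (assuming the spark-packages Maven repository, where the groupId is the GitHub user and the artifactId is the package name, mirroring the coordinates above):

<!-- Sketch only: add the spark-packages repository to pom.xml, then the dependency. -->
<repositories>
  <repository>
    <id>spark-packages</id>
    <url>https://dl.bintray.com/spark-packages/maven/</url>
  </repository>
</repositories>

<dependency>
  <groupId>zhzhan</groupId>
  <artifactId>shc</artifactId>
  <version>0.0.11-1.6.1-s_2.10</version>
</dependency>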

All options are possible.

Re: Spark Hbase connector for Spark 1.6.1 version

Contributor

I have followed the above steps and am using the package zhzhan/shc:0.0.11-1.6.1-s_2.10.

On executing the code, I am getting the following exception:

Exception in thread "main" java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.init(HBaseResources.scala:93)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.liftedTree1$1(HBaseResources.scala:57)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.acquire(HBaseResources.scala:54)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.acquire(HBaseResources.scala:88)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.releaseOnException(HBaseResources.scala:74)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.releaseOnException(HBaseResources.scala:88)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.<init>(HBaseResources.scala:108)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:190)
    at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$anonfun$org$apache$spark$sql$DataFrame$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.DataFrame$anonfun$org$apache$spark$sql$DataFrame$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$execute$1(DataFrame.scala:1537)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$collect(DataFrame.scala:1544)
    at org.apache.spark.sql.DataFrame$anonfun$head$1.apply(DataFrame.scala:1414)
    at org.apache.spark.sql.DataFrame$anonfun$head$1.apply(DataFrame.scala:1413)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138)
    at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1413)
    at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1495)
    at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:171)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:394)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:355)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:363)
    at com.sparhbaseintg.trnsfm.HBasesrc$.main(Hbasesrc.scala:83)
    at com.sparhbaseintg.trnsfm.HBasesrc.main(Hbasesrc.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 44 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.RpcRetryingCallerFactory.instantiate(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hbase/client/ServerStatisticTracker;)Lorg/apache/hadoop/hbase/client/RpcRetryingCallerFactory;
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.createAsyncProcess(ConnectionManager.java:2242)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:690)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:630)

I can see the org.apache.hadoop.hbase.client.RpcRetryingCallerFactory.instantiate method in the HBase client JAR; I am not sure why it is not being resolved.

Please help.

Thanks!

Re: Spark Hbase connector for Spark 1.6.1 version

Super Guru

Is HBase running? Do you have a firewall blocking it?

What JDK are you using? Perhaps an incompatible version? Any other logs or details you can share?
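It is also worth checking which JAR the JVM actually loads the class from; a NoSuchMethodError for a method that appears to be in the JAR usually means an older copy of the class shadows it on the classpath. A quick check from spark-shell (plain JVM reflection, nothing connector-specific):

// Prints the location of the JAR the class was actually loaded from.
println(classOf[org.apache.hadoop.hbase.client.RpcRetryingCallerFactory]
  .getProtectionDomain.getCodeSource.getLocation)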

Re: Spark Hbase connector for Spark 1.6.1 version

Contributor

Added the HBase client JAR; that fixed the issue.
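For anyone hitting the same error, a sketch of the resulting submit command (the HDP client JAR path and the application JAR name are assumptions, adjust to your cluster; the main class is the one from the stack trace above):

> $SPARK_HOME/bin/spark-submit --packages zhzhan:shc:0.0.11-1.6.1-s_2.10 \
    --jars /usr/hdp/current/hbase-client/lib/hbase-client.jar \
    --class com.sparhbaseintg.trnsfm.HBasesrc my-app.jar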

Thanks, Timothy!
