Spark HBase connector for Spark 1.6.1
Labels: Apache HBase, Apache Spark
Created 10-17-2016 03:56 PM
We are planning to use the Spark HBase connector from Hortonworks (https://github.com/hortonworks-spark/shc) for a new project.
Since we are using HDP 2.4.2, the supported Spark version is 1.6.1.
Can we use this Spark-HBase connector JAR with Spark 1.6.1?
Created 10-18-2016 03:06 AM
Yes, that is supported. I am sure you have already researched this connector. This article points out that Spark 1.6.1 is supported, and in practice the connector works with any Spark version since 1.2: http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/. The GitHub repository confirms the same; see the properties section of the pom.xml (https://github.com/hortonworks-spark/shc/blob/master/pom.xml):
<properties>
  <spark.version>1.6.1</spark.version>
  <hbase.version>1.1.2</hbase.version>
  ...
</properties>
Use the Spark-on-HBase connector as a standard Spark package.
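Once the package is on the classpath, usage is a plain DataFrame read driven by a JSON catalog that maps HBase columns to DataFrame columns. A minimal sketch for Spark 1.6, assuming a hypothetical HBase table table1 with column family cf1 (the catalog keys and data source name follow the SHC README; verify against the version you pull in):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object ShcReadSketch {
  // Hypothetical catalog: table "table1", row key exposed as column "key",
  // one string column "col1" stored in column family "cf1".
  val catalog = """{
                  |"table":{"namespace":"default", "name":"table1"},
                  |"rowkey":"key",
                  |"columns":{
                  |"key":{"cf":"rowkey", "col":"key", "type":"string"},
                  |"col1":{"cf":"cf1", "col":"col1", "type":"string"}
                  |}
                  |}""".stripMargin

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("shc-read-sketch"))
    val sqlContext = new SQLContext(sc)

    // SHC is a standard DataFrame data source; the catalog string drives the mapping.
    val df = sqlContext.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()

    df.show()
  }
}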
+++
If the response was helpful, please vote and accept the best answer.
Created 10-18-2016 05:33 AM
Can I use this as a Maven dependency, or should I use it as a standard Spark package? What is the difference? I have never used a standard Spark package.
Created 10-18-2016 02:15 PM
To include the Spark-on-HBase connector as a standard Spark package, pass it to spark-shell, pyspark, or spark-submit with the --packages flag:

$SPARK_HOME/bin/spark-shell --packages zhzhan:shc:0.0.11-1.6.1-s_2.10

You can also include the package as a dependency in your SBT file. The format is spark-package-name:version:

spDependencies += "zhzhan/shc:0.0.11-1.6.1-s_2.10"

You can also use it as a Maven dependency.
All options are possible.
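For the Maven route, a sketch of the pom.xml entries: spark-packages artifacts conventionally use the GitHub user as groupId and the repository name as artifactId, and the repository URL below is an assumption; verify both against the SHC README before relying on them.

<repositories>
  <repository>
    <id>spark-packages</id>
    <url>https://dl.bintray.com/spark-packages/maven/</url>
  </repository>
</repositories>

<dependencies>
  <!-- same coordinates as the spark-packages form above -->
  <dependency>
    <groupId>zhzhan</groupId>
    <artifactId>shc</artifactId>
    <version>0.0.11-1.6.1-s_2.10</version>
  </dependency>
</dependencies>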
Created 10-24-2016 10:47 AM
I have followed the above steps and am using the package zhzhan/shc:0.0.11-1.6.1-s_2.10.
On executing the code, I am getting the following exception:
Exception in thread "main" java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.init(HBaseResources.scala:93)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.liftedTree1$1(HBaseResources.scala:57)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.acquire(HBaseResources.scala:54)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.acquire(HBaseResources.scala:88)
    at org.apache.spark.sql.execution.datasources.hbase.ReferencedResource$class.releaseOnException(HBaseResources.scala:74)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.releaseOnException(HBaseResources.scala:88)
    at org.apache.spark.sql.execution.datasources.hbase.RegionResource.<init>(HBaseResources.scala:108)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD.getPartitions(HBaseTableScan.scala:60)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:190)
    at org.apache.spark.sql.execution.Limit.executeCollect(basicOperators.scala:165)
    at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
    at org.apache.spark.sql.DataFrame$anonfun$org$apache$spark$sql$DataFrame$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.DataFrame$anonfun$org$apache$spark$sql$DataFrame$execute$1$1.apply(DataFrame.scala:1538)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56)
    at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$execute$1(DataFrame.scala:1537)
    at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$collect(DataFrame.scala:1544)
    at org.apache.spark.sql.DataFrame$anonfun$head$1.apply(DataFrame.scala:1414)
    at org.apache.spark.sql.DataFrame$anonfun$head$1.apply(DataFrame.scala:1413)
    at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138)
    at org.apache.spark.sql.DataFrame.head(DataFrame.scala:1413)
    at org.apache.spark.sql.DataFrame.take(DataFrame.scala:1495)
    at org.apache.spark.sql.DataFrame.showString(DataFrame.scala:171)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:394)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:355)
    at org.apache.spark.sql.DataFrame.show(DataFrame.scala:363)
    at com.sparhbaseintg.trnsfm.HBasesrc$.main(Hbasesrc.scala:83)
    at com.sparhbaseintg.trnsfm.HBasesrc.main(Hbasesrc.scala)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 44 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.RpcRetryingCallerFactory.instantiate(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/hbase/client/ServerStatisticTracker;)Lorg/apache/hadoop/hbase/client/RpcRetryingCallerFactory;
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.createAsyncProcess(ConnectionManager.java:2242)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:690)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:630)
I can see the org.apache.hadoop.hbase.client.RpcRetryingCallerFactory.instantiate method in the HBase client JAR. I am not sure why it is not being resolved.
Please help.
Thanks!
Created 10-24-2016 11:35 AM
Is HBase running? Do you have a firewall blocking it?
What JDK are you using? Perhaps an incompatible version? Any other logs or details you can share?
Created 10-25-2016 03:09 PM
Adding the HBase client JAR fixed the issue.
Thanks, Timothy!
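For anyone who hits the same NoSuchMethodError: it typically means an older or mismatched hbase-client on the classpath, so the fix is to ship the cluster's own HBase client JARs (and hbase-site.xml) with the job. A sketch of the submit command, assuming the standard HDP /usr/hdp/current layout; the JAR paths and the application JAR name are placeholders, and the --class value is taken from the stack trace above:

$SPARK_HOME/bin/spark-submit \
  --packages zhzhan:shc:0.0.11-1.6.1-s_2.10 \
  --jars /usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar \
  --files /etc/hbase/conf/hbase-site.xml \
  --class com.sparhbaseintg.trnsfm.HBasesrc \
  your-app.jar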
