Created 05-25-2016 04:09 PM
I have a Spark application that needs to retrieve data from HBase directly.
I provide:
import org.apache.spark._
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.spark.rdd.NewHadoopRDD

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "timeseries")
with
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])

I receive the error
<console>:113: error: type mismatch;
 found   : org.apache.hadoop.conf.Configuration
 required: org.apache.hadoop.conf.Configuration
val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
^
This is a very confusing error. Am I missing an import, or do I have the conf types wrong?
Created 05-25-2016 04:15 PM
Looks like you were following an example using spark-shell.
Can you reproduce this using a standalone program?
Which HDP release are you using?
Created 05-25-2016 04:17 PM
Can you please try with the imports below?
import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.client.HTable
Also, if possible, please share your code and the command you used.
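For reference, a minimal end-to-end sketch in spark-shell, assuming the "timeseries" table from the original question, would tie these imports together roughly like this:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.client.Result

// HBaseConfiguration.create() layers hbase-site.xml on top of the Hadoop defaults
val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "timeseries")

// Read the table as an RDD of (row key, Result) pairs
val hBaseRDD = sc.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

println(hBaseRDD.count())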
Created 05-25-2016 04:19 PM
@Jitendra Yadav no change with the additional imports.
@Ted Yu I am converting this to a standalone program as we speak; I will post it later.
This is HDP 2.4.2 with Spark 1.6.1
Just FYI I tried
val conf = sc.hadoopConfiguration
With success, but this is essentially the same as the HBaseConfiguration and even reports the same type. I was thinking the two should be interchangeable; perhaps that is not the case?
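For what it's worth, a short sketch of the two ways of building the Configuration (assuming the "timeseries" table name from the question):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// A fresh Configuration with the HBase resources (hbase-site.xml) merged in
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "timeseries")

// Reuse the Configuration Spark already holds, with the HBase resources merged on top
val mergedConf = HBaseConfiguration.create(sc.hadoopConfiguration)
mergedConf.set(TableInputFormat.INPUT_TABLE, "timeseries")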
Created 05-25-2016 04:28 PM
Please try
conf.asInstanceOf[Configuration]
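Applied to the original snippet, that cast would sit at the call site, roughly:

import org.apache.hadoop.conf.Configuration

val hBaseRDD = sc.newAPIHadoopRDD(
  conf.asInstanceOf[Configuration],
  classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])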
Created 05-25-2016 10:04 PM
Finally got to the bottom of this one.
hbase-spark is not in Maven Central, and the error was not "in my face".
Once I added the hbase-spark repo to Maven, everything works as expected.
Thanks for your quick replies.
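For anyone finding this later, the equivalent fix in an sbt build would look roughly like the sketch below; the repository URL and version are placeholders, not values from this thread, so substitute the ones for your HDP release:

// build.sbt sketch -- the URL and version below are placeholders
resolvers += "HDP repository" at "https://repo.example.com/hdp/releases/"

libraryDependencies += "org.apache.hbase" % "hbase-spark" % "<your-hdp-hbase-spark-version>"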