Created 05-25-2016 04:09 PM
I have a Spark application that needs to retrieve data from HBase directly.
I provide:
import org.apache.spark._
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.rdd.NewHadoopRDD

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "timeseries")
with
val hBaseRDD = sc.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])
I receive the error
<console>:113: error: type mismatch;
 found   : org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.Configuration
 required: org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.Configuration
       val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
                                         ^
This is a very confusing error. Am I missing an import, or do I have the conf types wrong?
Created 05-25-2016 04:15 PM
It looks like you were following an example that uses spark-shell.
Can you reproduce this with a standalone program?
Which HDP release are you using?
Created 05-25-2016 04:17 PM
Can you please try with the imports below?
import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.client.HTable
Also, if possible, please share your code and the command you used to run it.
Created 05-25-2016 04:19 PM
@Jitendra Yadav No change with the additional imports.
@Ted Yu I am converting this to a standalone program as we speak; I will post it later.
This is HDP 2.4.2 with Spark 1.6.1
Just FYI, I tried
val conf = sc.hadoopConfiguration
with success, but this is essentially the same as the HBaseConfiguration and is even the same Configuration class. I was thinking these should be interchangeable; perhaps that is not the case?
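For completeness, here is a minimal sketch of that workaround as I ran it in spark-shell, assuming the same "timeseries" table and the imports from my first post:

import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Reuse the SparkContext's Hadoop configuration instead of HBaseConfiguration.create()
val conf = sc.hadoopConfiguration
conf.set(TableInputFormat.INPUT_TABLE, "timeseries")

val hBaseRDD = sc.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])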
Created 05-25-2016 04:28 PM
Please try
conf.asInstanceOf[Configuration]
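For example, a sketch only, reusing the call from the original post:

import org.apache.hadoop.conf.Configuration

// Cast the HBase configuration explicitly to the Hadoop Configuration type
val hBaseRDD = sc.newAPIHadoopRDD(
  conf.asInstanceOf[Configuration],
  classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])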
Created 05-25-2016 10:04 PM
I finally got to the bottom of this one.
The hbase-spark module is not in Maven Central, and the real error was not "in my face".
Once I added the repository that hosts hbase-spark to my Maven configuration, everything works as expected.
Thanks for your quick replies
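For anyone hitting the same thing, a rough sketch of what this looks like in an sbt build; the repository URL and version below are assumptions, not taken from this thread, so adjust them for your environment (or add the equivalent <repository>/<dependency> entries if you build with Maven):

// build.sbt sketch: add a resolver that hosts hbase-spark (URL is an assumption)
resolvers += "Hortonworks Releases" at "http://repo.hortonworks.com/content/repositories/releases/"

// Version string is a placeholder; use the hbase-spark version shipped with your HDP release
libraryDependencies += "org.apache.hbase" % "hbase-spark" % "<hdp-hbase-spark-version>"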