Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant. To ask a new question, please post a new topic on the appropriate active board.

Type mismatch for Hadoop Conf for HBase

Rising Star

I have a Spark application that needs to retrieve data from HBase directly.

I set up the following:

import org.apache.spark._
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.rdd.NewHadoopRDD

val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "timeseries")

and create the RDD with

val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], 
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

I receive this error:

<console>:113: error: type mismatch;
 found   : org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.Configuration
 required: org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.org.apache.hadoop.conf.Configuration
       val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat], 
                                         ^

This is a very confusing error, since the found and required types print identically. Am I missing an import, or do I have the conf types wrong?

1 ACCEPTED SOLUTION

Rising Star

I finally got to the bottom of this one.

hbase-spark is not in Maven Central, and the error was not "in my face".

Once I added the repository that hosts hbase-spark to my Maven configuration, everything works as expected.
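
For anyone hitting the same thing, here is a minimal sketch of the kind of build change involved, written as an sbt resolver (the repository URL and artifact coordinates below are illustrative assumptions, not copied from my build; the equivalent Maven fix is a <repository> entry in the pom):

// build.sbt -- sketch only; URL and version are assumptions.
// hbase-spark is not in Maven Central, so add the repository that hosts it.
resolvers += "Hortonworks Releases" at "https://repo.hortonworks.com/content/repositories/releases/"

// With the resolver in place, the dependency can resolve (coordinates illustrative).
libraryDependencies += "org.apache.hbase" % "hbase-spark" % "1.1.2.2.4.2.0-258"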

Thanks for your quick replies.


5 REPLIES

Master Collaborator

Looks like you were following an example using spark-shell.

Can you reproduce this in a standalone program?

Which HDP release are you using?

Super Guru

@wsalazar

Can you please try the imports below?

import org.apache.spark._
import org.apache.spark.rdd.NewHadoopRDD
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor}
import org.apache.hadoop.hbase.client.{HBaseAdmin, HTable, Put}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes

Also, if possible, please share your code and the command you ran.

Rising Star

@Jitendra Yadav No change with the additional imports.

@Ted Yu I am converting this to a standalone program as we speak; I will post it later.

This is HDP 2.4.2 with Spark 1.6.1.

Just FYI, I tried

val conf = sc.hadoopConfiguration

with success, but this is essentially the same as the HBaseConfiguration and even reports the same type. I was thinking the two should be interchangeable. Perhaps that is not the case?
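
A minimal sketch of what I mean by the two being interchangeable (same spark-shell session as above; HBaseConfiguration.create(Configuration) is the overload that layers the HBase settings on top of an existing Hadoop configuration):

// Option 1: start from HBase defaults (hbase-site.xml on the classpath).
val hbaseConf = HBaseConfiguration.create()
hbaseConf.set(TableInputFormat.INPUT_TABLE, "timeseries")

// Option 2: merge HBase settings into the SparkContext's Hadoop configuration.
val mergedConf = HBaseConfiguration.create(sc.hadoopConfiguration)
mergedConf.set(TableInputFormat.INPUT_TABLE, "timeseries")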

avatar
Master Collaborator

Please try

conf.asInstanceOf[Configuration]
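
In context, that would look like the sketch below (assuming the same imports as earlier in the thread, plus org.apache.hadoop.conf.Configuration; illustrative only):

import org.apache.hadoop.conf.Configuration

val hBaseRDD = sc.newAPIHadoopRDD(
  conf.asInstanceOf[Configuration],
  classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])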
