
Error while reading HBase table through Spark DataFrame


I am getting the below error while reading an HBase table through a Spark DataFrame:

 

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/spark/datasources/HBaseTableCatalog$
    at com.spark.hbase.integration.Spark_hbase_integration$.withCatalog$1(Spark_hbase_integration.scala:46)
    at com.spark.hbase.integration.Spark_hbase_integration$.main(Spark_hbase_integration.scala:54)
    at com.spark.hbase.integration.Spark_hbase_integration.main(Spark_hbase_integration.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog$
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 12 more

 

Below are the sbt dependencies (build.sbt):

name := "untitled1"

version := "0.1"

scalaVersion := "2.11.11"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"

// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"


// https://mvnrepository.com/artifact/org.apache.hbase.connectors.spark/hbase-spark
libraryDependencies += "org.apache.hbase.connectors.spark" % "hbase-spark" % "1.0.0"

// https://mvnrepository.com/artifact/org.apache.hbase/hbase-client
libraryDependencies += "org.apache.hbase" % "hbase-client" % "2.1.5"

// https://mvnrepository.com/artifact/org.apache.hbase/hbase-server
libraryDependencies += "org.apache.hbase" % "hbase-server" % "2.1.1"

resolvers += "Hortonworks Repository" at "https://repo.hortonworks.com/content/repositories/releases/"

libraryDependencies ++= Seq(
"com.hortonworks" % "shc-core" % "1.1.1-2.1-s_2.11"
)
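
Since the failure is a runtime NoClassDefFoundError (the project compiles fine), I suspect the connector classes are simply not on the classpath when the job runs. One thing I am considering is building a fat jar with the sbt-assembly plugin so the HBase connector gets bundled in; below is a minimal sketch of what I have in mind (the plugin version and merge strategy are my guesses, untested):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt changes: mark the Spark entries above as "provided" so only
// the connector classes end up in the fat jar, and resolve duplicate
// META-INF entries that the merged jars would otherwise clash on
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}

The jar produced by "sbt assembly" would then be passed to spark-submit instead of the plain package jar.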


For reference, this is the spark-submit command I use:

spark-submit \
  --class com.spark.hbase.integration.Spark_hbase_integration \
  --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11 \
  --repositories http://repo.hortonworks.com/content/groups/public/ \
  --files /etc/hbase/conf/hbase-site.xml \
  /home/sasmitsb4081/untitled1_2.11-0.1.jar local
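
Alternatively, since my import (org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog) comes from the hbase-spark connector rather than from shc-core, maybe those coordinates also have to be passed at submit time; something like this is what I would try next (untested):

spark-submit \
  --class com.spark.hbase.integration.Spark_hbase_integration \
  --packages org.apache.hbase.connectors.spark:hbase-spark:1.0.0,com.hortonworks:shc-core:1.1.1-2.1-s_2.11 \
  --repositories http://repo.hortonworks.com/content/groups/public/ \
  --files /etc/hbase/conf/hbase-site.xml \
  /home/sasmitsb4081/untitled1_2.11-0.1.jar local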

I have referred to https://github.com/hortonworks-spark/shc and I could run the same code in spark-shell, but it fails when the job is submitted with spark-submit.

Below is the Spark code:

package com.spark.hbase.integration

import org.apache.hadoop.hbase.spark.datasources.HBaseTableCatalog
import org.apache.spark.sql.{SparkSession, _}

object Spark_hbase_integration {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local")
      .appName("Spark SQL basic example")
      .config("spark.sql.warehouse.directory", "/user/hive/warehouse")
      .getOrCreate()

    def catalog =
      s"""{
         |"table":{"namespace":"default", "name":"video-creator"},
         |"rowkey":"key",
         |"columns":{
         |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
         |"vidcrt":{"cf":"vidcrt", "col":"creator_id", "type":"string"}
         |}
         |}""".stripMargin

    def withCatalog(cat: String): DataFrame = {
      spark.sqlContext
        .read
        .options(Map(HBaseTableCatalog.tableCatalog -> cat))
        .format("org.apache.spark.sql.execution.datasources.hbase")
        .load()
    }

    val df_video_creator = withCatalog(catalog)

    df_video_creator.show()
  }
}
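
One more thing I noticed while re-reading the SHC README: the data source I pass to .format(...) ("org.apache.spark.sql.execution.datasources.hbase") belongs to SHC, whose HBaseTableCatalog lives in that same package, while my import pulls the class from the hbase-spark connector instead. If shc-core is what --packages actually ships, I assume the import would need to match; here is a sketch of the SHC variant of the reader (following the SHC examples, untested on my side):

import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog
import org.apache.spark.sql.DataFrame

// Same catalog string as above; only the import changes, so that
// HBaseTableCatalog.tableCatalog resolves against the SHC jar that
// --packages puts on the classpath.
def withCatalog(cat: String): DataFrame =
  spark.sqlContext
    .read
    .options(Map(HBaseTableCatalog.tableCatalog -> cat))
    .format("org.apache.spark.sql.execution.datasources.hbase")
    .load()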

Please note I am using Spark version 2.1.1 and HBase version 1.1.2.

 
