Error while accessing an HBase table via spark-shell and SHC


Hi,

HDP 3.1.0.0-78, Spark 2.3.2.3.1.0.0-78

I built a fresh SHC connector from https://github.com/hortonworks-spark/shc

I am trying to read data from an HBase table:
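For reference, I built it roughly like this (a sketch only -- the `mvn` invocation is the one from the shc README, and the checkout path is the one I use below):

```shell
# Rough sketch of the build; adjust paths/branch as needed
git clone https://github.com/hortonworks-spark/shc.git /tmp/hbase/shc
cd /tmp/hbase/shc
# per the shc README; produces core/target/shc-core-*-s_2.11.jar
mvn package -DskipTests
```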

[root@ds02 shc]# export SPARK_CLASSPATH=/etc/hbase/3.1.0.0-78/0/hbase-site.xml
[root@ds02 shc]# export SPARK_DIST_CLASSPATH=/usr/hdp/3.1.0.0-78/hbase/lib/*
[root@ds02 shc]# spark-shell --jars /tmp/hbase/shc/core/target/shc-core-1.1.3-2.4-s_2.11.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/spark2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/phoenix/phoenix-5.0.0.3.1.0.0-78-server.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/07/02 15:31:56 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
19/07/02 15:31:56 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
[... the same 'SparkUI' bind warning repeats for ports 4042 through 4052 ...]
19/07/02 15:31:56 WARN Utils: Service 'SparkUI' could not bind on port 4053. Attempting port 4054.
Spark context Web UI available at http://ds02.localdomain:4054
Spark context available as 'sc' (master = yarn, app id = application_1562047463581_0068).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.2.3.1.0.0-78
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> import org.apache.spark.sql._;
import org.apache.spark.sql._

scala> import org.apache.spark.sql.datasources.hbase.HBaseTableCatalog;
import org.apache.spark.sql.datasources.hbase.HBaseTableCatalog

scala> val spark = SparkSession.builder().appName("HbaseTest").config("hbase.zookeeper.quorum","bi01.localdomain, bi02.localdomain, ds01.localdomain").config("hbase.zookeeper.property.clientPort","2181").config("zookeeper.znode.parent", "/hbase-unsecure").getOrCreate();
19/07/02 15:32:22 WARN SparkSession$Builder: Using an existing SparkSession; some configuration may not take effect.
spark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@3ab70d34
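(Side note on the WARN above: spark-shell has already created a SparkSession, so the `.config(...)` calls on the builder are ignored. As far as I understand, the HBase/ZooKeeper settings have to come from an hbase-site.xml visible to the job, or be set at launch -- something like the following sketch, not verified:)

```shell
# Sketch, assumed paths: ship hbase-site.xml with the job instead of relying on
# builder().config(...), which cannot reconfigure an already-running session
spark-shell \
  --jars /tmp/hbase/shc/core/target/shc-core-1.1.3-2.4-s_2.11.jar \
  --files /etc/hbase/3.1.0.0-78/0/hbase-site.xml
```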

scala> val cat = s"""{
     | |"table":{"namespace":"default", "name":"testTable"},
     | |"rowkey":"key",
     | |"columns":{
     | |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
     | |"col1":{"cf":"test", "col":"test_column_name", "type":"string"},
     | |"ACCOUNT_TYPE":{"cf":"test", "col":"test_column_name2", "type":"string"}
     | |}
     | |}""".stripMargin
cat: String =
{
"table":{"namespace":"default", "name":"testTable"},
"rowkey":"key",
"columns":{
"col0":{"cf":"rowkey", "col":"key", "type":"string"},
"col1":{"cf":"test", "col":"test_column_name", "type":"string"},
"ACCOUNT_TYPE":{"cf":"test", "col":"test_column_name2", "type":"string"}
}
}

scala> var df = spark.read.options(Map(HBaseTableCatalog.tableCatalog->cat)).format("org.apache.spark.sql.execution.datasources.hbase").load()
df: org.apache.spark.sql.DataFrame = [col0: string, col1: string ... 1 more field]

scala> df.count()

And I got this error:

[Stage 0:>                                                          (0 + 1) / 1]19/07/02 15:32:39 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ds01.localdomain, executor 1): java.io.InvalidClassException: org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD; local class incompatible: stream classdesc serialVersionUID = -1620527819929923458, local class serialVersionUID = 296721987802688695
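From what I can tell, an InvalidClassException with two different serialVersionUIDs means the executors are deserializing HBaseTableScanRDD from a different build of shc-core than the driver -- e.g. an older shc jar picked up from SPARK_DIST_CLASSPATH on the worker nodes. A hypothetical way to check each node (function name and search roots are mine, not from any tool):

```shell
# Hypothetical check: hash every shc-core jar visible under a root directory.
# Every node (driver and executors) must see exactly one and the same build,
# or deserialization fails with mismatched serialVersionUIDs as above.
find_shc_jars() {
  find "${1:-/usr/hdp}" -name 'shc-core*.jar' 2>/dev/null -exec md5sum {} +
}
# run on each node and compare the hashes:
find_shc_jars /usr/hdp/3.1.0.0-78
find_shc_jars /tmp/hbase/shc
```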


Full output here - https://pastebin.com/raw/gHePRct2


Can anybody help me? Thanks.