
Spark Hbase Connector NullPointerException

Solved


Explorer

Hi.

I'm trying to connect to HBase from Spark using this connector

https://github.com/hortonworks-spark/shc

This is my code:

def catalog = s"""{
                  |"table":{"namespace":"default", "name":"terminals"},
                  |"rowkey":"key",
                  |"columns":{
                  |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
                  |"col1":{"cf":"tinfo", "col":"status", "type":"int"},
                  |"col2":{"cf":"tinfo", "col":"latitude", "type":"double"},
                  |"col3":{"cf":"tinfo", "col":"longitude", "type":"double"}
                  |}
                  |}""".stripMargin

def withCatalog(cat: String): DataFrame = {
  sqlContext
    .read
    .options(Map(HBaseTableCatalog.tableCatalog->cat))
    .format("org.apache.spark.sql.execution.datasources.hbase")
    .load()
}

val df = withCatalog(catalog)
df.show()
val dfFilter = df.filter($"col0".isin("1212121"))
// "parsed" is a DataFrame built earlier in the job from the transaction data
parsed.join(dfFilter, parsed("terminal_id") === dfFilter("col0")).show()

but when I try to execute it with spark-submit:

spark-submit --class com.location.userTransactionMain --master local[*] --files /etc/hbase/conf/hbase-site.xml userTransactionAppScala-assembly-1.0.jar

It returns an error:

Exception in thread "main" java.lang.NullPointerException: Please define 'tableCoder' in your catalog. If there is an Avro records/schema in your catalog, please explicitly define 'coder' in its corresponding column.
	at org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog$.apply(HBaseTableCatalog.scala:223)
	at org.apache.spark.sql.execution.datasources.hbase.HBaseRelation.<init>(HBaseRelation.scala:77)
	at org.apache.spark.sql.execution.datasources.hbase.DefaultSource.createRelation(HBaseRelation.scala:51)
	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)

Can someone help me?

Thanks!!!!

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Spark Hbase Connector NullPointerException

Contributor

A "tableCoder" entry is required in the table definition in the catalog. Please refer to this example:

https://github.com/hortonworks-spark/shc/blob/master/examples/src/main/scala/org/apache/spark/sql/ex...
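For reference, here is a sketch of the catalog from the question with the required "tableCoder" entry added to the table definition. "PrimitiveType" is the coder value used in the SHC examples; if your columns hold Avro records, you would instead define a "coder" on those columns.

```scala
// Catalog from the question, with "tableCoder" added to the "table" section.
// "PrimitiveType" is the coder used in SHC's examples; adjust if you store
// Avro records in any column.
def catalog = s"""{
                  |"table":{"namespace":"default", "name":"terminals",
                  |         "tableCoder":"PrimitiveType"},
                  |"rowkey":"key",
                  |"columns":{
                  |"col0":{"cf":"rowkey", "col":"key", "type":"string"},
                  |"col1":{"cf":"tinfo", "col":"status", "type":"int"},
                  |"col2":{"cf":"tinfo", "col":"latitude", "type":"double"},
                  |"col3":{"cf":"tinfo", "col":"longitude", "type":"double"}
                  |}
                  |}""".stripMargin
```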

3 REPLIES


Re: Spark Hbase Connector NullPointerException

New Contributor

Hi,

Please copy the hbase-site.xml file to /etc/spark/conf and retry, then let me know how it went.
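Something like the following, assuming the default HDP config locations (adjust the paths for your cluster):

```shell
# Copy the HBase client config into Spark's conf dir so the driver and
# executors can find the HBase quorum settings. Paths assume default
# HDP locations; adjust for your cluster.
cp /etc/hbase/conf/hbase-site.xml /etc/spark/conf/
```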

Also refer to the article below:

https://community.hortonworks.com/content/supportkb/48988/how-to-run-spark-job-to-interact-with-secu...

Thanks.

Re: Spark Hbase Connector NullPointerException

New Contributor

I guess you are using old code; the latest code does not have this issue. Currently SHC uses "Phoenix" as the default table coder, but it has a compatibility issue, and we are working on PR#95 to fix it. In SHC we have release tags for each branch (e.g. tag v1.0.1-2.0 for Spark 2.0 and v1.0.1-1.6 for Spark 1.6) that mark the snapshots that should be used, as opposed to branch heads, which may be unstable.
