Using spark-hbase-connector Package with Pyspark

Rising Star

Hi all, I wanted to experiment with the "it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3" package (you can find it at ). It's an interesting add-on that lets Spark read and operate on HBase tables as RDDs.


If I load this library in a standard spark-shell (Scala), everything works smoothly:

spark-shell --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \

scala> import it.nerdammer.spark.hbase._
import it.nerdammer.spark.hbase._

If I try the same in a PySpark shell, since my goal is to use the extension from Python, the import fails and I can't use any of its functions:

PYSPARK_DRIVER_PYTHON=ipython pyspark --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \

In [1]: from it.nerdammer.spark.hbase import *
ImportError                               Traceback (most recent call last)
<ipython-input-1-37dd5a5ffba0> in <module>()
----> 1 from it.nerdammer.spark.hbase import *

ImportError: No module named it.nerdammer.spark.hbase

I have tried different combinations of environment variables, parameters, etc. when launching PySpark, but to no avail.


Maybe I'm doing something fundamentally wrong here, or maybe there is simply no Python API for this library. As a matter of fact, the examples on the package's home page are all in Scala (although they say you can also load the package in PySpark with the usual "--packages" parameter).
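
One way to test that suspicion: "--packages" resolves JVM artifacts and puts them on the Spark classpath, which is why the Scala import works. A Python "import" only succeeds if the artifact also ships Python sources that end up on the Python path, which does not appear to be the case here. The following is a minimal sketch of a check that can be run inside the PySpark shell; the helper name is_python_importable is made up for illustration and is not part of Spark or the connector.

```python
import importlib.util


def is_python_importable(module_name):
    """Return True if Python's import system can locate module_name.

    Hypothetical helper for illustration; it only inspects the Python
    path and says nothing about what is on the JVM classpath.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a
        # parent package such as "it" cannot be found at all.
        return False


# A standard-library module, for comparison with the connector's package name.
print(is_python_importable("json"))
print(is_python_importable("it.nerdammer.spark.hbase"))
```

If the second check reports that the module cannot be located even though the shell was started with "--packages", the library is available to the JVM only, and the ImportError is expected rather than a configuration problem.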

Can anybody help out with the "ImportError: No module named it.nerdammer.spark.hbase" error message?


Thanks for any insight


Master Guru
Here's one example that uses the native hbase-spark module via DataFrames in PySpark:
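
The linked example is not reproduced in this thread, so what follows is only a rough sketch of the general shape such a DataFrame read can take, not the original example. It assumes the hbase-spark jars and the HBase client configuration are already on the Spark classpath. The data source name "org.apache.hadoop.hbase.spark" and the option keys "hbase.table" and "hbase.columns.mapping" are given as I recall them from the hbase-spark module and should be checked against the version shipped with your CDH release. The table name, field names, and column family are placeholders.

```python
# Data source name of the hbase-spark module (assumption: verify for your release).
HBASE_SPARK_FORMAT = "org.apache.hadoop.hbase.spark"


def columns_mapping(columns):
    """Build the mapping string from (field, sql_type, hbase_column) tuples.

    The row key is written as ":key"; other columns as "family:qualifier".
    """
    return ", ".join("{} {} {}".format(name, sql_type, hbase_col)
                     for name, sql_type, hbase_col in columns)


def hbase_read_options(table, columns):
    """Collect the reader options for one HBase table (option keys are assumptions)."""
    return {
        "hbase.table": table,
        "hbase.columns.mapping": columns_mapping(columns),
    }


def load_hbase_table(sql_context, table, columns):
    """Return a DataFrame backed by the given HBase table."""
    return (sql_context.read
            .format(HBASE_SPARK_FORMAT)
            .options(**hbase_read_options(table, columns))
            .load())


# Placeholder schema: a row key plus two columns in column family "c".
EXAMPLE_COLUMNS = [
    ("KEY_FIELD", "STRING", ":key"),
    ("A_FIELD", "STRING", "c:a"),
    ("B_FIELD", "STRING", "c:b"),
]
```

In a PySpark shell this would be called as load_hbase_table(sqlContext, "my_table", EXAMPLE_COLUMNS), where "my_table" is a hypothetical table name. Because everything goes through the DataFrame reader, no Python import of a JVM package is needed.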


Rising Star

Thanks. This seems like a good alternative; as a matter of fact, I was not aware it was available in CDH 5.7.


Marking the thread as solved, although I don't yet know whether the native hbase-spark connector has all the features I need.
