<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Using spark-hbase-connector Package with Pyspark in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/41445#M30030</link>
    <description>&lt;P&gt;Hi all, I wanted to experiment with the "it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3" package (you can find it at &lt;A href="https://spark-packages.org/package/nerdammer/spark-hbase-connector" target="_blank"&gt;spark-packages.org&lt;/A&gt;). It's an interesting add-on that lets you view and operate on HBase tables as RDDs via Spark.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I run this extension library in a standard spark-shell (with Scala support), everything works smoothly:&lt;/P&gt;&lt;PRE&gt;spark-shell --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=&amp;lt;HBASE_HOST&amp;gt;

scala&amp;gt; import it.nerdammer.spark.hbase._
import it.nerdammer.spark.hbase._&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;However, &lt;STRONG&gt;my goal is to use the extension with Python&lt;/STRONG&gt;. When I try the same thing in a PySpark shell, I can't import the functions or use anything from the package:&lt;/P&gt;&lt;PRE&gt;PYSPARK_DRIVER_PYTHON=ipython pyspark --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=&amp;lt;HBASE_HOST&amp;gt;

In [1]: from it.nerdammer.spark.hbase import *
---------------------------------------------------------------------------
ImportError &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Traceback (most recent call last)
&amp;lt;ipython-input-1-37dd5a5ffba0&amp;gt; in &amp;lt;module&amp;gt;()
----&amp;gt; 1 from it.nerdammer.spark.hbase import *

ImportError: No module named it.nerdammer.spark.hbase&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;I have tried different combinations of environment variables, parameters, etc. when launching PySpark, but to no avail.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Maybe I'm doing something fundamentally wrong here, or maybe there is simply no Python API for this library. As a matter of fact, the examples on the package's home page are all in Scala (although it says you can install the package in PySpark too, with the usual "--packages" parameter).&lt;BR /&gt;&lt;BR /&gt;Can anybody help with the "ImportError: No module named it.nerdammer.spark.hbase" error message?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for any insight.&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 10:22:14 GMT</pubDate>
    <dc:creator>FrozenWave</dc:creator>
    <dc:date>2022-09-16T10:22:14Z</dc:date>
    <item>
      <title>Using spark-hbase-connector Package with Pyspark</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/41445#M30030</link>
      <description>&lt;P&gt;Hi all, I wanted to experiment with the "it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3" package (you can find it at &lt;A href="https://spark-packages.org/package/nerdammer/spark-hbase-connector" target="_blank"&gt;spark-packages.org&lt;/A&gt;). It's an interesting add-on that lets you view and operate on HBase tables as RDDs via Spark.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I run this extension library in a standard spark-shell (with Scala support), everything works smoothly:&lt;/P&gt;&lt;PRE&gt;spark-shell --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=&amp;lt;HBASE_HOST&amp;gt;

scala&amp;gt; import it.nerdammer.spark.hbase._
import it.nerdammer.spark.hbase._&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;However, &lt;STRONG&gt;my goal is to use the extension with Python&lt;/STRONG&gt;. When I try the same thing in a PySpark shell, I can't import the functions or use anything from the package:&lt;/P&gt;&lt;PRE&gt;PYSPARK_DRIVER_PYTHON=ipython pyspark --packages it.nerdammer.bigdata:spark-hbase-connector_2.10:1.0.3 \
--conf spark.hbase.host=&amp;lt;HBASE_HOST&amp;gt;

In [1]: from it.nerdammer.spark.hbase import *
---------------------------------------------------------------------------
ImportError &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Traceback (most recent call last)
&amp;lt;ipython-input-1-37dd5a5ffba0&amp;gt; in &amp;lt;module&amp;gt;()
----&amp;gt; 1 from it.nerdammer.spark.hbase import *

ImportError: No module named it.nerdammer.spark.hbase&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;I have tried different combinations of environment variables, parameters, etc. when launching PySpark, but to no avail.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Maybe I'm doing something fundamentally wrong here, or maybe there is simply no Python API for this library. As a matter of fact, the examples on the package's home page are all in Scala (although it says you can install the package in PySpark too, with the usual "--packages" parameter).&lt;BR /&gt;&lt;BR /&gt;Can anybody help with the "ImportError: No module named it.nerdammer.spark.hbase" error message?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for any insight.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:22:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/41445#M30030</guid>
      <dc:creator>FrozenWave</dc:creator>
      <dc:date>2022-09-16T10:22:14Z</dc:date>
    </item>
    <item>
      <title>Re: Using spark-hbase-connector Package with Pyspark</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/43263#M30031</link>
      <description>Here's one example that uses the native hbase-spark module via DataFrames in PySpark: &lt;A href="http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Include-latest-hbase-spark-in-CDH/m-p/43236/highlight/true#M2280" target="_blank"&gt;http://community.cloudera.com/t5/Storage-Random-Access-HDFS/Include-latest-hbase-spark-in-CDH/m-p/43236/highlight/true#M2280&lt;/A&gt;</description>
      <pubDate>Wed, 27 Jul 2016 13:37:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/43263#M30031</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-07-27T13:37:54Z</dc:date>
    </item>
    <item>
      <title>Re: Using spark-hbase-connector Package with Pyspark</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/43315#M30032</link>
      <description>&lt;P&gt;Thanks. That seems like a good alternative; as a matter of fact, I was not aware that it was available in CDH 5.7.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm marking the thread as solved, even though I don't yet know whether the native hbase-spark connector will have all the features I need.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Jul 2016 10:11:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Using-spark-hbase-connector-Package-with-Pyspark/m-p/43315#M30032</guid>
      <dc:creator>FrozenWave</dc:creator>
      <dc:date>2016-07-28T10:11:13Z</dc:date>
    </item>
  </channel>
</rss>

