<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Reading from and Writing to HBase with a spark DataFrame in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146202#M19929</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2311/davidtam.html" nodeid="2311"&gt;@David Tam&lt;/A&gt;, for a working example that uses phoenix-spark to read and write HBase data as DataFrames, check out &lt;A href="https://github.com/randerzander/HiveToPhoenix" target="_blank"&gt;https://github.com/randerzander/HiveToPhoenix&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Feb 2016 05:32:42 GMT</pubDate>
    <dc:creator>rgelhausen</dc:creator>
    <dc:date>2016-02-18T05:32:42Z</dc:date>
    <item>
      <title>Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146198#M19925</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I was recently tasked with working out a way to read data from HBase into a Spark DataFrame and, once the transformation / enrichment is done, write the DataFrame back into HBase.&lt;/P&gt;&lt;P&gt;What is the best way of doing this? I can see that Cloudera has a sparkOnHBase package (but I think they have contributed the code to HBase, and the Maven modules are at version 0.0.x-clabs-SNAPSHOT, which doesn't sound reassuring). There is also an HBase-Spark module in Apache HBase, but it seems that it has not been released yet.&lt;/P&gt;&lt;P&gt;Ideally it would be something similar to these:&lt;/P&gt;&lt;PRE&gt;// using spark-csv from databricks
DataFrame csvDF = sqlContext.read()
        .format("csv")
        .options(options)
        .load(hdfs.getURI("hdfs://sandbox:8020"));

// using spark-solr from lucidworks
DataFrame solrDF = sqlContext.read()
        .format("solr")
        .options(options)
        .load();
&lt;/PRE&gt;&lt;P&gt;Is there something similar to these in the HBase world?&lt;/P&gt;&lt;P&gt;I have also seen &lt;A href="https://community.hortonworks.com/questions/6585/whats-the-best-practice-to-get-data-from-hbase-and.html"&gt;this thread with the experimental connector&lt;/A&gt; but I would really prefer something more mature.&lt;/P&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 00:00:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146198#M19925</guid>
      <dc:creator>David_Tam</dc:creator>
      <dc:date>2016-02-18T00:00:29Z</dc:date>
    </item>
    <item>
      <title>Re: Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146199#M19926</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2311/davidtam.html" nodeid="2311"&gt;@David Tam&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;Right now the only definite answer is &lt;A href="https://phoenix.apache.org/phoenix_spark.html" target="_blank"&gt;https://phoenix.apache.org/phoenix_spark.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;HBase-Spark is not released yet. It is expected soon, but no timeline has been announced.&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 00:12:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146199#M19926</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-18T00:12:29Z</dc:date>
    </item>
    <item>
      <title>Re: Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146200#M19927</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/2311/davidtam.html" nodeid="2311"&gt;@David Tam&lt;/A&gt;&lt;P&gt;See this JIRA: &lt;A href="https://issues.apache.org/jira/browse/HBASE-13992" target="_blank"&gt;https://issues.apache.org/jira/browse/HBASE-13992&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 00:23:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146200#M19927</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-18T00:23:57Z</dc:date>
    </item>
    <item>
      <title>Re: Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146201#M19928</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2311/davidtam.html" nodeid="2311"&gt;@David Tam&lt;/A&gt; It is amazing to see how many JIRAs there are on the same topic, for example &lt;A href="https://issues.apache.org/jira/browse/HBASE-14181"&gt;https://issues.apache.org/jira/browse/HBASE-14181&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A target="_blank" href="https://www.google.com/search?q=hbase+spark+jira&amp;amp;rlz=1C5CHFA_enUS548US548&amp;amp;oq=hbase+spark+jira&amp;amp;aqs=chrome..69i57j0.2199j1j4&amp;amp;sourceid=chrome&amp;amp;es_sm=119&amp;amp;ie=UTF-8"&gt;Link&lt;/A&gt;
&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 00:25:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146201#M19928</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-18T00:25:17Z</dc:date>
    </item>
    <item>
      <title>Re: Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146202#M19929</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/2311/davidtam.html" nodeid="2311"&gt;@David Tam&lt;/A&gt;, for a working example that uses phoenix-spark to read and write HBase data as DataFrames, check out &lt;A href="https://github.com/randerzander/HiveToPhoenix" target="_blank"&gt;https://github.com/randerzander/HiveToPhoenix&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 05:32:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146202#M19929</guid>
      <dc:creator>rgelhausen</dc:creator>
      <dc:date>2016-02-18T05:32:42Z</dc:date>
    </item>
    <item>
      <title>Re: Reading from and Writing to HBase with a spark DataFrame</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146203#M19930</link>
      <description>&lt;P&gt;Thanks, all, for the input. The phoenix-spark example looks very close to what we need. I am not sure whether the people on my team would be happy with Phoenix, but I will bring it up and see. Meanwhile, I will also follow the HBase JIRA and hope that the module is released soon.&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Thu, 18 Feb 2016 16:28:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Reading-from-and-Writing-to-HBase-with-a-spark-DataFrame/m-p/146203#M19930</guid>
      <dc:creator>David_Tam</dc:creator>
      <dc:date>2016-02-18T16:28:40Z</dc:date>
    </item>
  </channel>
</rss>

