<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: What's the best practice to get data from hbase and form dataframe for Python/R? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99381#M12603</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1069/cuilin.html" nodeid="1069"&gt;@Cui Lin&lt;/A&gt; I updated my response above with links to MapReduce examples. You will need to set up a scanner based on your criteria and then run MapReduce to write the data out to files. For Pig, here's an &lt;A href="https://github.com/dbist/pig/blob/master/hbase/count.pig"&gt;example&lt;/A&gt; that reads data from an HBase table; after that you just call "store data into 'location' using" followed by the storage function of your choice.&lt;/P&gt;</description>
    <pubDate>Sat, 09 Jan 2016 02:34:51 GMT</pubDate>
    <dc:creator>aervits</dc:creator>
    <dc:date>2016-01-09T02:34:51Z</dc:date>
    <item>
      <title>What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99376#M12598</link>
      <description>&lt;P&gt;What's the best practice for getting data from HBase into a data frame for Python/R? If we want to use our pandas/R libraries, how can we get data out of HBase and build a data frame automatically?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Dec 2015 04:00:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99376#M12598</guid>
      <dc:creator>cui_lin</dc:creator>
      <dc:date>2015-12-16T04:00:34Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99377#M12599</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1069/cuilin.html" nodeid="1069"&gt;@Cui Lin&lt;/A&gt; I am not an R guy, but these should give you a good starting point, depending on whether you want to use RevR, R or Python.&lt;/P&gt;&lt;P&gt;RHbase tutorials --&amp;gt;&lt;/P&gt;&lt;P&gt; &lt;A href="https://github.com/RevolutionAnalytics/RHadoop/wiki/user%3Erhbase%3EHome"&gt;https://github.com/RevolutionAnalytics/RHadoop/wik...&lt;/A&gt; &lt;/P&gt;&lt;P&gt;&lt;A href="http://www.odbms.org/2015/06/intro-to-hbase-via-r-a-tutorial/"&gt;http://www.odbms.org/2015/06/intro-to-hbase-via-r-...&lt;/A&gt; &lt;/P&gt;&lt;P&gt;&lt;A href="http://radar.oreilly.com/2014/08/scaling-up-data-frames.html"&gt;http://radar.oreilly.com/2014/08/scaling-up-data-f...&lt;/A&gt;&lt;/P&gt;&lt;P&gt;PandaHbase --&amp;gt; &lt;/P&gt;&lt;P&gt;&lt;A href="https://github.com/livingstonese/pandas-hbase"&gt;https://github.com/livingstonese/pandas-hbase&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 17 Dec 2015 01:46:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99377#M12599</guid>
      <dc:creator>pardeep_kumar</dc:creator>
      <dc:date>2015-12-17T01:46:29Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99378#M12600</link>
      <description>&lt;P&gt;It seems that the above can't satisfy all my needs. What's the best way to get data out of HBase and save it to files instead?&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:21:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99378#M12600</guid>
      <dc:creator>cui_lin</dc:creator>
      <dc:date>2016-01-09T02:21:46Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99379#M12601</link>
      <description>&lt;P&gt;You can write a &lt;A href="https://hbase.apache.org/book.html#mapreduce"&gt;MapReduce&lt;/A&gt; program to dump data to files, you can use Pig, or you can use Python with &lt;A href="https://happybase.readthedocs.org/en/latest/"&gt;happybase&lt;/A&gt;; you have a lot of different options, &lt;A rel="user" href="https://community.cloudera.com/users/1069/cuilin.html" nodeid="1069"&gt;@Cui Lin&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:25:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99379#M12601</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-09T02:25:50Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99380#M12602</link>
      <description>&lt;P&gt;I first need to run a query to select records based on time, and then dump the data into files or a data frame. Happybase can't support queries and its index has to be an integer. Could you point me to some MapReduce or Pig examples?&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:30:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99380#M12602</guid>
      <dc:creator>cui_lin</dc:creator>
      <dc:date>2016-01-09T02:30:37Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99381#M12603</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1069/cuilin.html" nodeid="1069"&gt;@Cui Lin&lt;/A&gt; I updated my response above with links to MapReduce examples. You will need to set up a scanner based on your criteria and then run MapReduce to write the data out to files. For Pig, here's an &lt;A href="https://github.com/dbist/pig/blob/master/hbase/count.pig"&gt;example&lt;/A&gt; that reads data from an HBase table; after that you just call "store data into 'location' using" followed by the storage function of your choice.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:34:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99381#M12603</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-09T02:34:51Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99382#M12604</link>
      <description>&lt;P&gt;Is there any example of getting data from HBase using Spark on Hortonworks? MapR and Cloudera have some packages like this, but I'm not sure whether they would work on Hortonworks.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:35:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99382#M12604</guid>
      <dc:creator>cui_lin</dc:creator>
      <dc:date>2016-01-09T02:35:51Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99383#M12605</link>
      <description>&lt;P&gt;There's work in progress to make Spark and HBase work efficiently together on the Hortonworks side; we're not publishing anything until we can support it.&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:38:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99383#M12605</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-09T02:38:59Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99384#M12606</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/372/enis.html" nodeid="372"&gt;@Enis&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/332/vshukla.html" nodeid="332"&gt;@vshukla&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/412/ddas.html" nodeid="412"&gt;@Devaraj Das&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/528/rsriharsha.html" nodeid="528"&gt;@Ram Sriharsha&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 02:40:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99384#M12606</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-01-09T02:40:27Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99385#M12607</link>
      <description>&lt;P&gt;We have an experimental Spark HBase connector, &lt;A href="https://github.com/zhzhan/shc" target="_blank"&gt;https://github.com/zhzhan/shc&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It has the following features:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;First class support for DataFrame API&lt;/LI&gt;&lt;LI&gt;JSON based catalog with rich data type support&lt;/LI&gt;&lt;LI&gt;Performant, scalable, enterprise-ready&lt;/LI&gt;&lt;LI&gt;Partition Pruning&lt;/LI&gt;&lt;LI&gt;Predicate Pushdown&lt;/LI&gt;&lt;LI&gt;Scan optimizations&lt;/LI&gt;&lt;LI&gt;Data Locality&lt;/LI&gt;&lt;LI&gt;Composite Rowkey&lt;/LI&gt;&lt;LI&gt;Leverage existing work in the HBase community&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Please take a look at the README of the above project.&lt;/P&gt;&lt;P&gt;Also see the example &lt;A href="https://github.com/zhzhan/shc/blob/master/src/main/scala/org/apache/spark/sql/execution/datasources/hbase/examples/HBaseSource.scala" target="_blank"&gt;https://github.com/zhzhan/shc/blob/master/src/main/scala/org/apache/spark/sql/execution/datasources/hbase/examples/HBaseSource.scala&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jan 2016 03:04:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99385#M12607</guid>
      <dc:creator>vshukla</dc:creator>
      <dc:date>2016-01-09T03:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99386#M12608</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/393/aervits.html" nodeid="393"&gt;@Artem Ervits&lt;/A&gt;, is there any progress on Spark on HBase from Hortonworks? We are using the HDP platform, but I can't easily find anything on the internet confirming that there has been progress beyond the above discussion in 2016.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2017 13:29:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99386#M12608</guid>
      <dc:creator>sai_geetha</dc:creator>
      <dc:date>2017-12-28T13:29:20Z</dc:date>
    </item>
    <item>
      <title>Re: What's the best practice to get data from hbase and form dataframe for Python/R?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99387#M12609</link>
      <description>&lt;P&gt; &lt;A rel="user" href="https://community.cloudera.com/users/45487/saigeetha.html" nodeid="45487"&gt;@Sai Geetha M N&lt;/A&gt; please read our latest docs: &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch_introduction-spark.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch_introduction-spark.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;and &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch08s05.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_spark-component-guide/content/ch08s05.html&lt;/A&gt;. It's been out for a while now.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Jan 2018 00:38:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-s-the-best-practice-to-get-data-from-hbase-and-form/m-p/99387#M12609</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2018-01-26T00:38:26Z</dc:date>
    </item>
  </channel>
</rss>

