<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Java Spark Program and Hive table backed by HBase table in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132189#M47678</link>
    <description>&lt;P&gt;The solution was as follows.&lt;/P&gt;&lt;P&gt;Spark provides a sample HBase test program in /usr/hdp/current/spark-client/examples/src/main/scala/org/apache/spark/examples.&lt;/P&gt;&lt;P&gt;The program is HBaseTest.scala. If you open this file, you will see the comment:&lt;/P&gt;&lt;P&gt;    // please ensure HBASE_CONF_DIR is on classpath of spark driver &lt;/P&gt;&lt;P&gt;    // e.g: set it through spark.driver.extraClassPath property &lt;/P&gt;&lt;P&gt;    // in spark-defaults.conf or through --driver-class-path &lt;/P&gt;&lt;P&gt;    // command line option of spark-submit&lt;/P&gt;&lt;P&gt;So I added that parameter, and my command line became:&lt;/P&gt;&lt;P&gt;spark-submit --jars hive-hbase-handler.jar,hbase-client.jar,hbase-common.jar,hbase-hadoop-compat.jar,hbase-hadoop2-compat.jar,hbase-protocol.jar,hbase-server.jar,metrics-core.jar,guava.jar --driver-class-path postgresql.jar --master yarn-client --files /usr/hdp/current/hbase-client/conf/hbase-site.xml --class SparkJS --driver-class-path /etc/hbase/2.5.0.0-1245/0 spark-js-1.jar&lt;/P&gt;&lt;P&gt;The issue is gone, and I can now do what I need to do.&lt;/P&gt;</description>
    <pubDate>Thu, 08 Dec 2016 08:33:25 GMT</pubDate>
    <dc:creator>shigeru_takehar</dc:creator>
    <dc:date>2016-12-08T08:33:25Z</dc:date>
    <item>
      <title>Java Spark Program and Hive table backed by HBase table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132186#M47675</link>
      <description>&lt;P&gt;I have a Hive table that is backed by an HBase table. Querying it from the Hive command line works fine; however, when I try the same from Spark Java code, where I create a DataFrame from a select statement and call its show method, I see the following messages repeat forever:&lt;/P&gt;&lt;P&gt;16/11/30 19:40:31 INFO ClientCnxn: Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x15802d56675006a, negotiated timeout = 90000 &lt;/P&gt;&lt;P&gt;16/11/30 19:40:31 INFO RegionSizeCalculator: Calculating region sizes for table "st_tbl_1". &lt;/P&gt;&lt;P&gt;16/11/30 19:41:19 INFO RpcRetryingCaller: Call exception, tries=10, retries=35, started=48332 ms ago, cancelled=false, msg= &lt;/P&gt;&lt;P&gt;16/11/30 19:41:40 INFO RpcRetryingCaller: Call exception, tries=11, retries=35, started=68473 ms ago, cancelled=false, msg= &lt;/P&gt;&lt;P&gt;16/11/30 19:42:00 INFO RpcRetryingCaller: Call exception, tries=12, retries=35, started=88545 ms ago, cancelled=false, msg= &lt;/P&gt;&lt;P&gt;16/11/30 19:42:20 INFO RpcRetryingCaller: Call exception, tries=13, retries=35, started=108742 ms ago, cancelled=false, msg=&lt;/P&gt;</description>
      <pubDate>Thu, 01 Dec 2016 08:53:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132186#M47675</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-12-01T08:53:52Z</dc:date>
    </item>
    <item>
      <title>Re: Java Spark Program and Hive table backed by HBase table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132187#M47676</link>
      <description>&lt;P&gt;I typically don't recommend using Hive atop HBase; performance is terrible once you get into high data volumes. Instead, you could create your HBase tables, access the data programmatically from Spark through the DataFrames API, and use Phoenix to create a view atop HBase for SQL analytics. Phoenix is orders of magnitude faster than Hive for SQL on HBase, and it's easy to use: you use Phoenix to define a schema over the HBase table. Try it out.&lt;/P&gt;&lt;P&gt;Hortonworks also released a DataFrame-based Spark-on-HBase connector that you can use:&lt;/P&gt;&lt;P&gt;&lt;A href="http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/" target="_blank"&gt;http://hortonworks.com/blog/spark-hbase-dataframe-based-hbase-connector/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Dec 2016 10:50:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132187#M47676</guid>
      <dc:creator>bmathew</dc:creator>
      <dc:date>2016-12-01T10:50:50Z</dc:date>
    </item>
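    <!-- Editor's sketch: the Phoenix approach recommended above amounts to declaring a
         SQL schema over an existing HBase table. A minimal illustration follows; the
         table name "st_tbl_1" is taken from the question's logs, while the row-key and
         column names (pk, cf1, col1, col2) are hypothetical placeholders, not from this
         thread. Run such DDL in the Phoenix sqlline.py client. -->

```sql
-- Map an existing HBase table "st_tbl_1" into Phoenix as a view.
-- The HBase row key becomes the primary key; "cf1"."col1" maps to
-- column family cf1, qualifier col1 (names here are illustrative).
CREATE VIEW "st_tbl_1" (
    pk VARCHAR PRIMARY KEY,
    "cf1"."col1" VARCHAR,
    "cf1"."col2" VARCHAR
);

-- The view can then be queried with standard SQL:
SELECT pk, "cf1"."col1" FROM "st_tbl_1" LIMIT 10;
```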
    <item>
      <title>Re: Java Spark Program and Hive table backed by HBase table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132188#M47677</link>
      <description>&lt;P&gt;Thank you for the recommendation, but I would like to solve this issue first.&lt;/P&gt;&lt;P&gt;We are using HDP 2.5. Previously, on HDP 2.3, I could not run Spark with Phoenix. Does HDP 2.5 allow us to use Phoenix with Spark 1.6.2?&lt;/P&gt;</description>
      <pubDate>Thu, 01 Dec 2016 13:19:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132188#M47677</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-12-01T13:19:32Z</dc:date>
    </item>
    <item>
      <title>Re: Java Spark Program and Hive table backed by HBase table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132189#M47678</link>
      <description>&lt;P&gt;The solution was as follows.&lt;/P&gt;&lt;P&gt;Spark provides a sample HBase test program in /usr/hdp/current/spark-client/examples/src/main/scala/org/apache/spark/examples.&lt;/P&gt;&lt;P&gt;The program is HBaseTest.scala. If you open this file, you will see the comment:&lt;/P&gt;&lt;P&gt;    // please ensure HBASE_CONF_DIR is on classpath of spark driver &lt;/P&gt;&lt;P&gt;    // e.g: set it through spark.driver.extraClassPath property &lt;/P&gt;&lt;P&gt;    // in spark-defaults.conf or through --driver-class-path &lt;/P&gt;&lt;P&gt;    // command line option of spark-submit&lt;/P&gt;&lt;P&gt;So I added that parameter, and my command line became:&lt;/P&gt;&lt;P&gt;spark-submit --jars hive-hbase-handler.jar,hbase-client.jar,hbase-common.jar,hbase-hadoop-compat.jar,hbase-hadoop2-compat.jar,hbase-protocol.jar,hbase-server.jar,metrics-core.jar,guava.jar --driver-class-path postgresql.jar --master yarn-client --files /usr/hdp/current/hbase-client/conf/hbase-site.xml --class SparkJS --driver-class-path /etc/hbase/2.5.0.0-1245/0 spark-js-1.jar&lt;/P&gt;&lt;P&gt;The issue is gone, and I can now do what I need to do.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2016 08:33:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132189#M47678</guid>
      <dc:creator>shigeru_takehar</dc:creator>
      <dc:date>2016-12-08T08:33:25Z</dc:date>
    </item>
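    <!-- Editor's sketch: the essence of the fix above is making hbase-site.xml visible
         to the Spark driver. The HBaseTest.scala comment quoted in the post names two
         equivalent ways to do that; both are sketched below as config fragments. Paths
         are illustrative for an HDP install, and SparkJS / spark-js-1.jar are the
         poster's names; adjust for your cluster. -->

```shell
# Option 1: one-off, via spark-submit flags -- ship hbase-site.xml to the
# executors and put the HBase conf directory on the driver classpath.
spark-submit \
  --master yarn-client \
  --files /usr/hdp/current/hbase-client/conf/hbase-site.xml \
  --driver-class-path /usr/hdp/current/hbase-client/conf \
  --class SparkJS \
  spark-js-1.jar

# Option 2: permanent, via spark-defaults.conf -- the property the
# HBaseTest.scala comment refers to:
#   spark.driver.extraClassPath /usr/hdp/current/hbase-client/conf
```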
    <item>
      <title>Re: Java Spark Program and Hive table backed by HBase table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132190#M47679</link>
      <description>&lt;P&gt;To confirm: the issue was that the HBase configuration was not available to Spark. You can also check the Spark HBase Connector we support at &lt;A href="https://github.com/hortonworks-spark/shc" target="_blank"&gt;https://github.com/hortonworks-spark/shc&lt;/A&gt;. It has many features, and it also documents the configuration needed for Spark HBase access, including the security aspects.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Dec 2016 10:47:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Java-Spark-Program-and-Hive-table-backed-by-HBase-table/m-p/132190#M47679</guid>
      <dc:creator>bikas</dc:creator>
      <dc:date>2016-12-09T10:47:20Z</dc:date>
    </item>
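    <!-- Editor's sketch: the usual way to pull a published connector such as shc into a
         spark-submit run is the --packages flag. The artifact coordinates and repository
         URL below are illustrative assumptions, not taken from this thread -- check the
         shc README for the coordinates matching your Spark and HBase versions. YourApp
         and your-app.jar are placeholders. -->

```shell
spark-submit \
  --packages com.hortonworks:shc-core:1.1.1-2.1-s_2.11 \
  --repositories https://repo.hortonworks.com/content/groups/public \
  --files /usr/hdp/current/hbase-client/conf/hbase-site.xml \
  --class YourApp \
  your-app.jar
```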
  </channel>
</rss>

