<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: List hbase tables Spark sql in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140402#M52179</link>
    <description>&lt;P&gt;Hive, and HiveContext in Spark, can only show tables that are registered in the Hive Metastore, and HBase tables are usually not there because the schema of most HBase tables is not easily defined in the metastore.&lt;/P&gt;&lt;P&gt;To read HBase tables from Spark using the DataFrame API, please consider the &lt;A target="_blank" href="https://github.com/hortonworks-spark/shc"&gt;Spark HBase Connector&lt;/A&gt;.&lt;/P&gt;</description>
    <pubDate>Mon, 23 Jan 2017 09:33:46 GMT</pubDate>
    <dc:creator>bikas</dc:creator>
    <dc:date>2017-01-23T09:33:46Z</dc:date>
    <item>
      <title>List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140400#M52177</link>
      <description>&lt;P&gt;I would like to list HBase tables using Spark SQL.&lt;/P&gt;&lt;P&gt;I tried the code below, but it is not working. Do we need to set the HBase host, ZooKeeper quorum, and other such details in the Spark SQL context options?&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val sparkConf = new SparkConf().setAppName("test") &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val sc = new SparkContext(sparkConf)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val sqlContext = new SQLContext(sc) &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val hiveContext = new HiveContext(sqlContext) &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val listOfTables = hiveContext.sql("list") &lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;listOfTables.show&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Jan 2017 14:38:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140400#M52177</guid>
      <dc:creator>sancar_sn</dc:creator>
      <dc:date>2017-01-22T14:38:33Z</dc:date>
    </item>
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140401#M52178</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/13783/sancarsn.html" nodeid="13783"&gt;@Sankaraiah Narayanasamy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can't list Hbase tables using Spark SQL because Hbase tables do not have a schema. Each row can have a different number of columns and each column is stored as a byte array not a specific data types. HiveContext will only allow you to list tables in Hive not Hbase. If you have Apache Phoenix installed over the top of Hbase, it is possible to see a list of tables, but not using HiveContext. &lt;/P&gt;&lt;P&gt;If you are trying to see a list of Hive Tables that SparkSQL can access, then the command is "show tables" not "list". So your code should be.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;val listOfTables = hiveContext.sql("show tables")&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;This will work assuming that you have Spark configured to point at the Hive Metastore.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2017 07:51:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140401#M52178</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2017-01-23T07:51:28Z</dc:date>
    </item>
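The answer above can be sketched as a corrected version of the original snippet, assuming Spark 1.x as used in the thread. Note that `HiveContext` takes the `SparkContext` directly (the original question passed a `SQLContext`, which does not match the Spark 1.x constructor), and that `"show tables"` queries the Hive Metastore, not HBase:

```scala
// Minimal sketch: listing Hive Metastore tables from Spark 1.x.
// Assumes spark-hive is on the classpath and hive-site.xml points
// at the Metastore; "test" is an illustrative app name.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object ShowHiveTables {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("test")
    val sc = new SparkContext(sparkConf)
    // HiveContext wraps the SparkContext directly in Spark 1.x.
    val hiveContext = new HiveContext(sc)

    // "show tables" is the HiveQL command; "list" is not valid HiveQL.
    val listOfTables = hiveContext.sql("show tables")
    listOfTables.show()
  }
}
```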
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140402#M52179</link>
      <description>&lt;P&gt;Hive, and HiveContext in Spark, can only show tables that are registered in the Hive Metastore, and HBase tables are usually not there because the schema of most HBase tables is not easily defined in the metastore.&lt;/P&gt;&lt;P&gt;To read HBase tables from Spark using the DataFrame API, please consider the &lt;A target="_blank" href="https://github.com/hortonworks-spark/shc"&gt;Spark HBase Connector&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2017 09:33:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140402#M52179</guid>
      <dc:creator>bikas</dc:creator>
      <dc:date>2017-01-23T09:33:46Z</dc:date>
    </item>
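To make the suggestion above concrete: SHC reads one known table through a JSON catalog supplied in the program (it has no call to enumerate tables). A minimal sketch, where the table name `events`, column family `cf`, and column `payload` are hypothetical placeholders:

```scala
// Sketch: reading a single, already-known HBase table via SHC.
// Assumes the shc-core jar is on the classpath; names are illustrative.
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object ReadWithSHC {
  // The catalog maps HBase cells onto a DataFrame schema explicitly.
  val catalog: String =
    s"""{
       |  "table":{"namespace":"default", "name":"events"},
       |  "rowkey":"key",
       |  "columns":{
       |    "rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
       |    "payload":{"cf":"cf", "col":"payload", "type":"string"}
       |  }
       |}""".stripMargin

  def read(sqlContext: SQLContext) = {
    sqlContext.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()
  }
}
```

The catalog is per-table by design, which is exactly why SHC cannot answer a "list all tables" request.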
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140403#M52180</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/462/bikas.html" nodeid="462"&gt;@Bikas&lt;/A&gt; &lt;/P&gt;&lt;P&gt;We are actually using HortonWorks Hbase connector, But i cannot use this API to list tables, this is just for one POC , which we are trying to list Hbase tables.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2017 11:53:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140403#M52180</guid>
      <dc:creator>sancar_sn</dc:creator>
      <dc:date>2017-01-23T11:53:50Z</dc:date>
    </item>
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140404#M52181</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3656/vvaks.html" nodeid="3656"&gt;@Vadim Vaks&lt;/A&gt;:&lt;/P&gt;&lt;P&gt;Thanks for the answer, so we cannot list the Hbase tables using Spark SQL Context.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2017 11:55:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140404#M52181</guid>
      <dc:creator>sancar_sn</dc:creator>
      <dc:date>2017-01-23T11:55:01Z</dc:date>
    </item>
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140405#M52182</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/13783/sancarsn.html" nodeid="13783"&gt;@Sankaraiah Narayanasamy&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Not unless you create a Hive table using an Hbase storage handler:&lt;/P&gt;&lt;P&gt;&lt;A href="https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration"&gt;https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration&lt;/A&gt;&lt;/P&gt;&lt;P&gt;This will impose a schema onto an Hbase table through Hive and save the schema in the metastore. Once it's in the metastore, you can access it through HiveContext.&lt;/P&gt;&lt;P&gt;Or if you have Phoenix installed and you create a table through Phoenix, it will create am Hbase table as well as a schema catalog table. You can do a direct JDBC connection to Phoenix just like you would connect to mysql or postgres. You just need to use the Phoenix JDBC driver. You can then use meta data getters on the JDBC connection object to get the tables in the Phoenix. &lt;/P&gt;&lt;P&gt;Once you know the table you want to go after&lt;/P&gt;&lt;P&gt;import org.apache.phoenix.spark._&lt;/P&gt;&lt;P&gt;val df = sqlContext.load("org.apache.phoenix.spark", Map("table"-&amp;gt;"phoenix_table","zkUrl"-&amp;gt;"localhost:2181:/hbase-unsecure"))&lt;/P&gt;&lt;P&gt;df.show&lt;/P&gt;&lt;P&gt;This way, Spark will load data using executors in parallel. Now just use the Data Frame with the SQL context like normal.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jan 2017 14:28:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140405#M52182</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2017-01-23T14:28:24Z</dc:date>
    </item>
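The storage-handler route above can be sketched as follows: registering an existing HBase table in the Hive Metastore makes it visible to `HiveContext` and to "show tables". The DDL is issued through `hiveContext.sql`; the table name `events`, column family `cf`, and qualifier `payload` are hypothetical placeholders, not from the thread:

```scala
// Sketch: mapping an existing HBase table into the Hive Metastore via
// HBaseStorageHandler, so HiveContext can see and query it.
// Assumes hive-hbase-handler and the HBase client jars are available.
import org.apache.spark.sql.hive.HiveContext

object RegisterHBaseTable {
  def register(hiveContext: HiveContext): Unit = {
    val ddl =
      """CREATE EXTERNAL TABLE hbase_events (rowkey STRING, payload STRING)
        |STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
        |WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:payload")
        |TBLPROPERTIES ("hbase.table.name" = "events")""".stripMargin

    hiveContext.sql(ddl)
    // The HBase-backed table now shows up alongside native Hive tables.
    hiveContext.sql("show tables").show()
  }
}
```

This registers only the one table you map; it does not make HBase's full table list visible to Hive.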
    <item>
      <title>Re: List hbase tables Spark sql</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140406#M52183</link>
      <description>&lt;P&gt;SHC does not have a notion of listing tables in HBase; it works on the table catalog provided to the data source in the program. Hive will also not list HBase tables because they are not present in the metastore. There is a rudimentary way to add HBase external tables in Hive, but I don't think that is widely used. I could be wrong.&lt;/P&gt;&lt;P&gt;To list HBase tables, currently the only reliable way is to use the HBase client APIs inside the Spark program.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 03:45:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/List-hbase-tables-Spark-sql/m-p/140406#M52183</guid>
      <dc:creator>bikas</dc:creator>
      <dc:date>2017-01-24T03:45:44Z</dc:date>
    </item>
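The final answer's suggestion, listing tables with the HBase client API from inside the Spark driver, can be sketched like this. The quorum values are illustrative assumptions; in practice an `hbase-site.xml` on the classpath supplies them:

```scala
// Sketch: listing HBase tables directly with the HBase 1.x client API.
// Assumes hbase-client is on the classpath; connection settings are
// placeholders for your ZooKeeper ensemble.
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.ConnectionFactory

object ListHBaseTables {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "localhost")
    conf.set("hbase.zookeeper.property.clientPort", "2181")

    val connection = ConnectionFactory.createConnection(conf)
    try {
      val admin = connection.getAdmin
      // listTableNames returns a TableName for every table visible to the user.
      admin.listTableNames().foreach(t => println(t.getNameAsString))
      admin.close()
    } finally {
      connection.close()
    }
  }
}
```

This runs on the driver; the same `Admin` calls also work from any plain JVM client, independent of Spark.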
  </channel>
</rss>

