<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Spark SQL as a Federated DB in Production? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147734#M23913</link>
    <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/24935/spark-sql-as-a-federated-db-in-production.html#"&gt;@azeltov&lt;/A&gt; How large is the implementation (how many tables, and how much is cached versus read through to the source)? Where an entire table is cached, how do they ensure the data does not go stale? Are the tables temporary, or are they saved in the Hive metastore?&lt;/P&gt;</description>
    <pubDate>Tue, 29 Mar 2016 23:27:17 GMT</pubDate>
    <dc:creator>vvaks</dc:creator>
    <dc:date>2016-03-29T23:27:17Z</dc:date>
    <item>
      <title>Spark SQL as a Federated DB in Production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147732#M23911</link>
      <description>&lt;P&gt;There are some great articles and threads here on HCC about using Spark to query data from other JDBC sources and mash it up with anything else you can get into an RDD. Has anyone seen this pattern (Spark as a federated DB, including JDBC sources) actually used in production (with the JDBC Thrift server)? What is the right configuration within a secure, multi-tenant Hadoop cluster?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 19:44:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147732#M23911</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-29T19:44:44Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL as a Federated DB in Production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147733#M23912</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3656/vvaks.html" nodeid="3656"&gt;@Vadim&lt;/A&gt; I have seen Spark SQL used in production for pulling RDBMS data. I have not yet seen it used in a Kerberos environment; I will follow this thread for secure configurations.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 23:14:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147733#M23912</guid>
      <dc:creator>azeltov</dc:creator>
      <dc:date>2016-03-29T23:14:30Z</dc:date>
    </item>
    <item>
      <title>Re: Spark SQL as a Federated DB in Production?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147734#M23913</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/24935/spark-sql-as-a-federated-db-in-production.html#"&gt;@azeltov&lt;/A&gt; How large is the implementation (how many tables, and how much is cached versus read through to the source)? Where an entire table is cached, how do they ensure the data does not go stale? Are the tables temporary, or are they saved in the Hive metastore?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 23:27:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Spark-SQL-as-a-Federated-DB-in-Production/m-p/147734#M23913</guid>
      <dc:creator>vvaks</dc:creator>
      <dc:date>2016-03-29T23:27:17Z</dc:date>
    </item>
  </channel>
</rss>

