<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Why are T-SQL queries from MS SQL Server 2016 via  PolyBase on data in HDFS so slow? Doesn't PolyBase convert T-SQL to MapReduce? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-are-T-SQL-queries-from-MS-SQL-Server-2016-via-PolyBase/m-p/174333#M46135</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12546/claymcdonald.html" nodeid="12546"&gt;@Clay McDonald&lt;/A&gt;. One reason is MapReduce. Hive uses Tez, but PolyBase is not yet compatible with Tez. MapReduce is a batch data processing engine, not one built for low-latency queries. You will also want to make sure your Hive tables are configured according to best practices. Try implementing some of these rules where applicable: &lt;A href="http://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/" target="_blank"&gt;http://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/&lt;/A&gt;. &lt;/P&gt;&lt;P&gt;Also be aware of your cluster size. MapReduce (like other data processing engines) uses parallel processing, but if you don't have many nodes then you are not taking advantage of the design. &lt;/P&gt;&lt;P&gt;Not sure if it's applicable in your case, but you could use multiple SQL Servers to parallelize your PolyBase query. &lt;A href="https://msdn.microsoft.com/en-us/library/mt607030.aspx" target="_blank"&gt;https://msdn.microsoft.com/en-us/library/mt607030.aspx&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 15 Nov 2016 00:45:20 GMT</pubDate>
    <dc:creator>SQLShaw</dc:creator>
    <dc:date>2016-11-15T00:45:20Z</dc:date>
    <item>
      <title>Why are T-SQL queries from MS SQL Server 2016 via  PolyBase on data in HDFS so slow? Doesn't PolyBase convert T-SQL to MapReduce?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-are-T-SQL-queries-from-MS-SQL-Server-2016-via-PolyBase/m-p/174332#M46134</link>
      <description>&lt;P&gt;I cannot find any in-depth documentation on how exactly PolyBase works, and nothing on performance tuning. Any help would be appreciated.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Nov 2016 00:24:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-are-T-SQL-queries-from-MS-SQL-Server-2016-via-PolyBase/m-p/174332#M46134</guid>
      <dc:creator>clay_mcdonald</dc:creator>
      <dc:date>2016-11-15T00:24:15Z</dc:date>
    </item>
    <item>
      <title>Re: Why are T-SQL queries from MS SQL Server 2016 via  PolyBase on data in HDFS so slow? Doesn't PolyBase convert T-SQL to MapReduce?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-are-T-SQL-queries-from-MS-SQL-Server-2016-via-PolyBase/m-p/174333#M46135</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12546/claymcdonald.html" nodeid="12546"&gt;@Clay McDonald&lt;/A&gt;. One reason is MapReduce. Hive uses Tez, but PolyBase is not yet compatible with Tez. MapReduce is a batch data processing engine, not one built for low-latency queries. You will also want to make sure your Hive tables are configured according to best practices. Try implementing some of these rules where applicable: &lt;A href="http://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/" target="_blank"&gt;http://hortonworks.com/blog/5-ways-make-hive-queries-run-faster/&lt;/A&gt;. &lt;/P&gt;&lt;P&gt;Also be aware of your cluster size. MapReduce (like other data processing engines) uses parallel processing, but if you don't have many nodes then you are not taking advantage of the design. &lt;/P&gt;&lt;P&gt;Not sure if it's applicable in your case, but you could use multiple SQL Servers to parallelize your PolyBase query. &lt;A href="https://msdn.microsoft.com/en-us/library/mt607030.aspx" target="_blank"&gt;https://msdn.microsoft.com/en-us/library/mt607030.aspx&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Nov 2016 00:45:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Why-are-T-SQL-queries-from-MS-SQL-Server-2016-via-PolyBase/m-p/174333#M46135</guid>
      <dc:creator>SQLShaw</dc:creator>
      <dc:date>2016-11-15T00:45:20Z</dc:date>
    </item>
  </channel>
</rss>

