<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Impala external table partition files added over time - How can I broadcast a REFRESH to all nod in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Impala-external-table-partition-files-added-over-time-How/m-p/17038#M2605</link>
    <description>Hi,&lt;BR /&gt;Issuing a REFRESH will broadcast the updated table metadata to all nodes. There may be a short delay after the command completes before the changes are visible on remote nodes. This is because we must wait for a statestore update to propagate the changes.&lt;BR /&gt;&lt;BR /&gt;If you need stronger consistency guarantees (the change is visible on all nodes by the time the statement completes), you can set the query option:&lt;BR /&gt;SET SYNC_DDL=true&lt;BR /&gt;&lt;BR /&gt;This will impact DDL performance, so it is disabled by default. For optimal performance, you can batch your DDL statements and enable this query option only for the final statement in the batch. For example:&lt;BR /&gt;&lt;BR /&gt;&amp;gt; connect to node 1&lt;BR /&gt;CREATE TABLE Foo&lt;BR /&gt;ALTER TABLE Foo ADD PARTITION 1&lt;BR /&gt;..&lt;BR /&gt;SET SYNC_DDL=true&lt;BR /&gt;ALTER TABLE Foo ADD PARTITION N&lt;BR /&gt;&lt;BR /&gt;&amp;gt; connect to node 2&lt;BR /&gt;SELECT * FROM Foo&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Lenni</description>
    <pubDate>Fri, 15 Aug 2014 19:13:56 GMT</pubDate>
    <dc:creator>lskuff</dc:creator>
    <dc:date>2014-08-15T19:13:56Z</dc:date>
    <item>
      <title>Impala external table partition files added over time - How can I broadcast a REFRESH to all nodes?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Impala-external-table-partition-files-added-over-time-How/m-p/17026#M2604</link>
      <description>&lt;P&gt;I have a scenario where Parquet files are added over time to a partition of an Impala external table. The catalog service broadcasts changed metadata to all nodes as a result of ALTER TABLE, INSERT, and LOAD DATA, but it is obviously not aware when I create a Parquet file and add it to a partition directly. Is there a way to force the catalog service to broadcast to all nodes so that each of them refreshes its metadata? I'm trying to avoid having to connect to every Impala daemon (any of which can be connected to and used as a coordinator) to issue a REFRESH.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:05:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Impala-external-table-partition-files-added-over-time-How/m-p/17026#M2604</guid>
      <dc:creator>HariSeldon</dc:creator>
      <dc:date>2022-09-16T09:05:19Z</dc:date>
    </item>
    <item>
      <title>Re: Impala external table partition files added over time - How can I broadcast a REFRESH to all nod</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Impala-external-table-partition-files-added-over-time-How/m-p/17038#M2605</link>
      <description>Hi,&lt;BR /&gt;Issuing a REFRESH will broadcast the updated table metadata to all nodes. There may be a short delay after the command completes before the changes are visible on remote nodes. This is because we must wait for a statestore update to propagate the changes.&lt;BR /&gt;&lt;BR /&gt;If you need stronger consistency guarantees (the change is visible on all nodes by the time the statement completes), you can set the query option:&lt;BR /&gt;SET SYNC_DDL=true&lt;BR /&gt;&lt;BR /&gt;This will impact DDL performance, so it is disabled by default. For optimal performance, you can batch your DDL statements and enable this query option only for the final statement in the batch. For example:&lt;BR /&gt;&lt;BR /&gt;&amp;gt; connect to node 1&lt;BR /&gt;CREATE TABLE Foo&lt;BR /&gt;ALTER TABLE Foo ADD PARTITION 1&lt;BR /&gt;..&lt;BR /&gt;SET SYNC_DDL=true&lt;BR /&gt;ALTER TABLE Foo ADD PARTITION N&lt;BR /&gt;&lt;BR /&gt;&amp;gt; connect to node 2&lt;BR /&gt;SELECT * FROM Foo&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Lenni</description>
      <pubDate>Fri, 15 Aug 2014 19:13:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Impala-external-table-partition-files-added-over-time-How/m-p/17038#M2605</guid>
      <dc:creator>lskuff</dc:creator>
      <dc:date>2014-08-15T19:13:56Z</dc:date>
    </item>
  </channel>
</rss>

