<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Impala Metadata Sync Issue Disk I/O error datanode.fqdn:22000: Failed to open HDFS file in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365196#M239303</link>
    <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;We are facing an issue where, no matter what we try, Impala queries randomly throw a "Failed to open HDFS file" error. This seemingly started out of nowhere, and we are not sure what else to try.&lt;/P&gt;&lt;P&gt;Below are the things we have tried so far.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Enforced SYNC_DDL.&lt;/LI&gt;&lt;LI&gt;We used to have 87 Impala daemons, each acting as both executor and coordinator. We set up dedicated coordinators (4 coordinators + 83 executors), load balanced with HAProxy.&lt;/LI&gt;&lt;LI&gt;Tried adding INVALIDATE METADATA, and then removing it.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Below is the sequence of queries.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;INSERT OVERWRITE into a table (approximately every hour).&lt;/LI&gt;&lt;LI&gt;REFRESH.&lt;/LI&gt;&lt;LI&gt;COMPUTE STATS.&lt;/LI&gt;&lt;LI&gt;SELECT.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;The SELECT never fails on the coordinator that ran the INSERT, but it fails randomly on other coordinators, and it keeps failing there until a REFRESH is run. As soon as a REFRESH is run on the failing coordinator, the query succeeds.&lt;/P&gt;&lt;P&gt;This leads me to believe it is a metadata sync issue across coordinators. The problem is that multiple applications and dashboards use Impala, and we cannot ask them to run a REFRESH every time.&lt;/P&gt;&lt;P&gt;impalad version 3.2.0-cdh6.3.3&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;SohamR&lt;/P&gt;</description>
    <pubDate>Fri, 03 Mar 2023 06:20:57 GMT</pubDate>
    <dc:creator>SohamR</dc:creator>
    <dc:date>2023-03-03T06:20:57Z</dc:date>
    <item>
      <title>Impala Metadata Sync Issue Disk I/O error datanode.fqdn:22000: Failed to open HDFS file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365196#M239303</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;We are facing an issue where, no matter what we try, Impala queries randomly throw a "Failed to open HDFS file" error. This seemingly started out of nowhere, and we are not sure what else to try.&lt;/P&gt;&lt;P&gt;Below are the things we have tried so far.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Enforced SYNC_DDL.&lt;/LI&gt;&lt;LI&gt;We used to have 87 Impala daemons, each acting as both executor and coordinator. We set up dedicated coordinators (4 coordinators + 83 executors), load balanced with HAProxy.&lt;/LI&gt;&lt;LI&gt;Tried adding INVALIDATE METADATA, and then removing it.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Below is the sequence of queries.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;INSERT OVERWRITE into a table (approximately every hour).&lt;/LI&gt;&lt;LI&gt;REFRESH.&lt;/LI&gt;&lt;LI&gt;COMPUTE STATS.&lt;/LI&gt;&lt;LI&gt;SELECT.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;The SELECT never fails on the coordinator that ran the INSERT, but it fails randomly on other coordinators, and it keeps failing there until a REFRESH is run. As soon as a REFRESH is run on the failing coordinator, the query succeeds.&lt;/P&gt;&lt;P&gt;This leads me to believe it is a metadata sync issue across coordinators. The problem is that multiple applications and dashboards use Impala, and we cannot ask them to run a REFRESH every time.&lt;/P&gt;&lt;P&gt;impalad version 3.2.0-cdh6.3.3&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;SohamR&lt;/P&gt;</description>
      <pubDate>Fri, 03 Mar 2023 06:20:57 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365196#M239303</guid>
      <dc:creator>SohamR</dc:creator>
      <dc:date>2023-03-03T06:20:57Z</dc:date>
    </item>
    <item>
      <title>Re: Impala Metadata Sync Issue Disk I/O error datanode.fqdn:22000: Failed to open HDFS file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365331#M239320</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;Just something I have noticed: whenever we enable SYNC_DDL and run a REFRESH, the query sometimes does not even register or produce a query ID. At the same time, I see the error below in the coordinator logs:&lt;/P&gt;&lt;P&gt;I0304 18:53:17.278281 174197 thrift-util.cc:124] TAcceptQueueServer: Caught TException: SSL_read: Connection reset by peer&lt;/P&gt;&lt;P&gt;Does this point to an actual network/SSL error? Any insights would be helpful.&lt;/P&gt;&lt;P&gt;Regards&lt;BR /&gt;SohamR&lt;/P&gt;</description>
      <pubDate>Sat, 04 Mar 2023 17:55:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365331#M239320</guid>
      <dc:creator>SohamR</dc:creator>
      <dc:date>2023-03-04T17:55:02Z</dc:date>
    </item>
    <item>
      <title>Re: Impala Metadata Sync Issue Disk I/O error datanode.fqdn:22000: Failed to open HDFS file</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365731#M239380</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;Here is an example of an even worse scenario.&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;INSERT OVERWRITE (with SYNC_DDL) took approx. 114 mins: 03/09/2023 3:00 AM - 03/09/2023 4:53 AM.&lt;/LI&gt;&lt;LI&gt;From 03/09/2023 4:53 AM to 03/09/2023 6:02 AM, all SELECTs failed, for over 1 hour.&lt;/LI&gt;&lt;LI&gt;In the script, there is a COMPUTE STATS just after the INSERT OVERWRITE. Even though the INSERT OVERWRITE completed by 03/09/2023 4:53 AM, the COMPUTE STATS did not start until 03/09/2023 6:02 AM.&lt;/LI&gt;&lt;LI&gt;Was it waiting for SYNC_DDL to complete before starting the next DDL query? If yes, then why did the INSERT OVERWRITE complete before SYNC_DDL was complete?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Can anyone please help with any ideas?&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;SohamR&lt;/P&gt;</description>
      <pubDate>Thu, 09 Mar 2023 06:12:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Impala-Metadata-Sync-Issue-Disk-I-O-error-datanode-fqdn/m-p/365731#M239380</guid>
      <dc:creator>SohamR</dc:creator>
      <dc:date>2023-03-09T06:12:54Z</dc:date>
    </item>
  </channel>
</rss>

