<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question org.apache.hadoop.hive.ql.metadata.HiveException disappears when I add LIMIT clause to query in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-hadoop-hive-ql-metadata-HiveException-dissapears/m-p/138390#M35305</link>
    <description>&lt;P&gt;I have a query that
inserts records from one table to another, approximately 1.5 million records. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5900-query1.jpg" style="width: 1073px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21909i1DE8F17565166631/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5900-query1.jpg" alt="5900-query1.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I run the query, most
of the map tasks fail and the job aborts, displaying several exceptions. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5899-hive-ex.jpg" style="width: 1256px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21910iF60E81B4D641C9CE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5899-hive-ex.jpg" alt="5899-hive-ex.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;There is no problem at
all with the data being inserted or with the table schema, as the failed rows can
be inserted successfully in an isolated query. The DataNodes and YARN have plenty of
space left, and connectivity is not a problem, as the cluster is hosted on AWS EC2
and the machines are in the same virtual private cloud.&lt;/P&gt;&lt;P&gt;But here is the
strange thing… &lt;STRONG&gt;If I add a LIMIT clause to the original query, the job executes properly!&lt;/STRONG&gt;
The limit value is large enough to include all the records, so in effect there
is no actual limit. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5912-hive-succ.jpg" style="width: 689px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21911iAA7B811B53F3234E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5912-hive-succ.jpg" alt="5912-hive-succ.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Cluster Specs:&lt;/EM&gt;

The cluster is small, for testing purposes, and composed of 3 machines: the
ambari-server and 2 datanodes/nodemanagers with 8 GB of RAM each, 12.5 GB of YARN memory,
and 40 GB of HDFS disk space remaining.&lt;/P&gt;&lt;P&gt;Thanks in advance for
your time,&lt;/P&gt;</description>
    <pubDate>Mon, 19 Aug 2019 08:37:26 GMT</pubDate>
    <dc:creator>ezequiel</dc:creator>
    <dc:date>2019-08-19T08:37:26Z</dc:date>
    <item>
      <title>org.apache.hadoop.hive.ql.metadata.HiveException disappears when I add LIMIT clause to query</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-hadoop-hive-ql-metadata-HiveException-dissapears/m-p/138390#M35305</link>
      <description>&lt;P&gt;I have a query that
inserts records from one table to another, approximately 1.5 million records. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5900-query1.jpg" style="width: 1073px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21909i1DE8F17565166631/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5900-query1.jpg" alt="5900-query1.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I run the query, most
of the map tasks fail and the job aborts, displaying several exceptions. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5899-hive-ex.jpg" style="width: 1256px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21910iF60E81B4D641C9CE/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5899-hive-ex.jpg" alt="5899-hive-ex.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;There is no problem at
all with the data being inserted or with the table schema, as the failed rows can
be inserted successfully in an isolated query. The DataNodes and YARN have plenty of
space left, and connectivity is not a problem, as the cluster is hosted on AWS EC2
and the machines are in the same virtual private cloud.&lt;/P&gt;&lt;P&gt;But here is the
strange thing… &lt;STRONG&gt;If I add a LIMIT clause to the original query, the job executes properly!&lt;/STRONG&gt;
The limit value is large enough to include all the records, so in effect there
is no actual limit. &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="5912-hive-succ.jpg" style="width: 689px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21911iAA7B811B53F3234E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="5912-hive-succ.jpg" alt="5912-hive-succ.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Cluster Specs:&lt;/EM&gt;

The cluster is small, for testing purposes, and composed of 3 machines: the
ambari-server and 2 datanodes/nodemanagers with 8 GB of RAM each, 12.5 GB of YARN memory,
and 40 GB of HDFS disk space remaining.&lt;/P&gt;&lt;P&gt;Thanks in advance for
your time,&lt;/P&gt;</description>
      <pubDate>Mon, 19 Aug 2019 08:37:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-hadoop-hive-ql-metadata-HiveException-dissapears/m-p/138390#M35305</guid>
      <dc:creator>ezequiel</dc:creator>
      <dc:date>2019-08-19T08:37:26Z</dc:date>
    </item>
    <item>
      <title>Re: org.apache.hadoop.hive.ql.metadata.HiveException disappears when I add LIMIT clause to query</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-hadoop-hive-ql-metadata-HiveException-dissapears/m-p/138391#M35306</link>
      <description>&lt;P&gt;It looks like your DataNodes are dying from too many open files - check the "nofile" limit for the "hdfs" user in /etc/security/limits.d/.

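A quick way to inspect those limits on a datanode host (a generic sketch; the paths are the standard pam_limits locations, and the expected values depend on your distribution's defaults):

```shell
# Show the current shell's soft and hard open-file limits.
ulimit -Sn
ulimit -Hn

# List any per-user "nofile" overrides applied at login; the hdfs user
# typically needs a generous value here on a busy DataNode.
grep -rh nofile /etc/security/limits.d/ 2>/dev/null
```

Run the same checks as the hdfs user itself to see the limits its processes actually inherit.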
If you want to bypass that particular problem by changing the query plan, try setting:

set hive.optimize.sort.dynamic.partition=true;&lt;/P&gt;</description>
      <pubDate>Thu, 21 Jul 2016 01:14:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/org-apache-hadoop-hive-ql-metadata-HiveException-dissapears/m-p/138391#M35306</guid>
      <dc:creator>gopalv</dc:creator>
      <dc:date>2016-07-21T01:14:37Z</dc:date>
    </item>
  </channel>
</rss>

