<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Query Runs slow on hive when using NOT LIKE in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/43104#M34691</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you so much! The query run time has now decreased by 30 minutes.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Janani&lt;/P&gt;</description>
    <pubDate>Fri, 22 Jul 2016 09:18:00 GMT</pubDate>
    <dc:creator>JananiViswa1</dc:creator>
    <dc:date>2016-07-22T09:18:00Z</dc:date>
    <item>
      <title>Query Runs slow on hive when using NOT LIKE</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/42849#M34689</link>
      <description>&lt;P&gt;Hello everyone!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;On Hive 1.1.0, the following query runs for one hour. The table has nearly 2 billion records.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;select&lt;/P&gt;&lt;P&gt;C1,&lt;/P&gt;&lt;P&gt;C2,&lt;/P&gt;&lt;P&gt;C3,&lt;/P&gt;&lt;P&gt;C4&lt;/P&gt;&lt;P&gt;from &amp;lt;tablename&amp;gt;&lt;/P&gt;&lt;P&gt;where NOT ( instr(FileName,'&amp;lt;sometext&amp;gt;') &amp;gt; 0&lt;/P&gt;&lt;P&gt;or instr(FileName,'&amp;lt;sometext&amp;gt;') &amp;gt; 0&lt;/P&gt;&lt;P&gt;or instr(FileName,'&amp;lt;sometext&amp;gt;') &amp;gt; 0&lt;/P&gt;&lt;P&gt;or FileName='' )&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;and FILENAME not like '%&amp;lt;sometext&amp;gt;%'&lt;/P&gt;&lt;P&gt;..........;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Kindly suggest some better ways to optimise the above query.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Janani&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:29:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/42849#M34689</guid>
      <dc:creator>JananiViswa1</dc:creator>
      <dc:date>2022-09-16T10:29:56Z</dc:date>
    </item>
    <item>
      <title>Re: Query Runs slow on hive when using NOT LIKE</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/43095#M34690</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/17487"&gt;@JananiViswa1&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;At first glance, I can see that you have a lot of "like" operators in your query. A "like" operator incurs pattern matching (a regular expression in the general case), which is costly and may slow the query down.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Have you noticed where the slowness happens? Is it within Hive itself, or is it just that the MR job runs for a long time? If the MR job is what slows everything down, please consider reducing the split size of the job so that more mappers process the input data. To do this, please run the commands below before the query:&lt;/P&gt;&lt;PRE&gt;set mapred.min.split.size=63000000;
set mapred.max.split.size=64000000;&lt;/PRE&gt;&lt;P&gt;If my assumption is wrong, or you still have problems after applying the above change, please give me more information so that I can investigate further:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1) Hive log&lt;/P&gt;&lt;P&gt;If you use Hive CLI, please give us the command output&lt;/P&gt;&lt;P&gt;If you use HS2, please give us the HS2 log file or the relevant information in it&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2) MapReduce job configuration and log file&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3) The definition of your source table (output of "show create table &amp;lt;tbl_name&amp;gt;")&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jul 2016 01:05:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/43095#M34690</guid>
      <dc:creator>yshi</dc:creator>
      <dc:date>2016-07-22T01:05:56Z</dc:date>
    </item>
    <item>
      <title>Re: Query Runs slow on hive when using NOT LIKE</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/43104#M34691</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you so much! The query run time has now decreased by 30 minutes.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Janani&lt;/P&gt;</description>
      <pubDate>Fri, 22 Jul 2016 09:18:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Query-Runs-slow-on-hive-when-using-NOT-LIKE/m-p/43104#M34691</guid>
      <dc:creator>JananiViswa1</dc:creator>
      <dc:date>2016-07-22T09:18:00Z</dc:date>
    </item>
  </channel>
</rss>

