<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Select nth row in hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115979#M42840</link>
    <description>&lt;P&gt;HI,&lt;/P&gt;&lt;P&gt;I have 5 records in my database and  i want to select the 3rd records.Minus is not possible in hive.So i'm trying Left outer join to perform the query.It's giving me some random results.&lt;/P&gt;&lt;P&gt;Id,Codes&lt;/P&gt;&lt;P&gt;1   100
1   200
2   200
3   200
3   300&lt;/P&gt;&lt;P&gt;select  a.id,b.id from analytical  a inner  join (select  id from  analytical  order by id  desc limit 2) b on a.id=b.id where b.id is  null order by a.id asc limit 3&lt;/P&gt;&lt;P&gt;OUtput:&lt;/P&gt;&lt;P&gt;id,codes&lt;/P&gt;&lt;P&gt;1   NULL
1   NULL
2   NULL&lt;/P&gt;&lt;P&gt;ANy suggestions?&lt;/P&gt;</description>
    <pubDate>Thu, 06 Oct 2016 03:09:06 GMT</pubDate>
    <dc:creator>kumarvaibhav199</dc:creator>
    <dc:date>2016-10-06T03:09:06Z</dc:date>
    <item>
      <title>Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115979#M42840</link>
      <description>&lt;P&gt;HI,&lt;/P&gt;&lt;P&gt;I have 5 records in my database and  i want to select the 3rd records.Minus is not possible in hive.So i'm trying Left outer join to perform the query.It's giving me some random results.&lt;/P&gt;&lt;P&gt;Id,Codes&lt;/P&gt;&lt;P&gt;1   100
1   200
2   200
3   200
3   300&lt;/P&gt;&lt;P&gt;select  a.id,b.id from analytical  a inner  join (select  id from  analytical  order by id  desc limit 2) b on a.id=b.id where b.id is  null order by a.id asc limit 3&lt;/P&gt;&lt;P&gt;OUtput:&lt;/P&gt;&lt;P&gt;id,codes&lt;/P&gt;&lt;P&gt;1   NULL
1   NULL
2   NULL&lt;/P&gt;&lt;P&gt;ANy suggestions?&lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2016 03:09:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115979#M42840</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-10-06T03:09:06Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115980#M42841</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11938/kumarvaibhav1992.html" nodeid="11938"&gt;@Vaibhav Kumar&lt;/A&gt;&lt;P&gt;Two things here.&lt;/P&gt;&lt;P&gt;1. I don't understand your use of "a.id = b.id where b.id is null". When b.id is null, a.id and b.id will never be equal. However, it's your query and you probably know more about it, so you can ignore my comment if you know what I am talking about. &lt;/P&gt;&lt;P&gt;2. I think you need to use ROW_NUMBER function and then select the third row. This &lt;A href="http://thehobt.blogspot.com/2009/02/rownumber-rank-and-denserank.html"&gt;link&lt;/A&gt; describes usage of ROW_NUMBER() for SQL 2005 but it's the same for Hive.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2016 03:42:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115980#M42841</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2016-10-06T03:42:52Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115981#M42842</link>
      <description>&lt;P&gt;1. i think  An inner join between the same table twice  with limit 1 will solve this issue .&lt;/P&gt;&lt;P&gt;2 . I cannot go with rownumber  as you can see i have duplicate column values which will not give me correct results&lt;/P&gt;&lt;P&gt;
&lt;A rel="user" href="https://community.cloudera.com/users/10969/mqureshi.html" nodeid="10969"&gt;@mqureshi&lt;/A&gt; &lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2016 12:18:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115981#M42842</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-10-06T12:18:13Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115982#M42843</link>
      <description>&lt;P&gt;please use below query&lt;/P&gt;&lt;P&gt;with t as (&lt;/P&gt;&lt;P&gt;select row_number() over (order by Codes) as RowNum, ID,Codes&lt;/P&gt;&lt;P&gt;) &lt;/P&gt;&lt;P&gt;select * from t &lt;/P&gt;&lt;P&gt;where RowNum=3;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2016 17:47:37 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115982#M42843</guid>
      <dc:creator>prasoon6k</dc:creator>
      <dc:date>2016-10-06T17:47:37Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115983#M42844</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/13550/prasoon6k.html" nodeid="13550"&gt;@prasoon kumar&lt;/A&gt; with Doesn't work in hive&lt;/P&gt;</description>
      <pubDate>Fri, 07 Oct 2016 02:06:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115983#M42844</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-10-07T02:06:20Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115984#M42845</link>
      <description>&lt;P&gt;Hi Vaibhav,&lt;/P&gt;&lt;P&gt;Which version of hive you are working with?&lt;/P&gt;&lt;P&gt;Please check below link&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 07 Oct 2016 12:26:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115984#M42845</guid>
      <dc:creator>prasoon6k</dc:creator>
      <dc:date>2016-10-07T12:26:35Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115985#M42846</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/11938/kumarvaibhav1992.html"&gt;Vaibhav Kumar&lt;/A&gt;&lt;/P&gt;&lt;P&gt;To re-iterate what @&lt;A href="https://community.hortonworks.com/users/10969/mqureshi.html"&gt;mqureshi&lt;/A&gt; already noticed, your query does not seem functionally correct. 1=1 is true, but null=null is false. Different story.&lt;/P&gt;&lt;P&gt;If you use LIMIT row_count with ORDER BY, Hive,  like MySQL and many other SQL-like engines, ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result.&lt;/P&gt;&lt;P&gt;ROW_NUMBER function and select third row is what you need.&lt;/P&gt;&lt;P&gt;Your idea of inner join will not scale for many records. If you have duplicates then write your query to eliminate the duplicate or deal with that, but one would still wonder how would you determine the true third row when you have a duplicate? I don't see anything in the query you wrote dealing with that problem. I think there is still a lot of work to do on paper before even writing SQL.&lt;/P&gt;</description>
      <pubDate>Sat, 08 Oct 2016 01:25:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115985#M42846</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-10-08T01:25:17Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115986#M42847</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486"&gt;@Constantin Stanca&lt;/A&gt;  yes it will not work in case in duplicates but in case of distinct values the below worked:&lt;/P&gt;&lt;P&gt;select a.id,b.id from analytical a inner join (select id from analytical order by id desc limit 3) b on a.id=b.id ORDER BY A.ID ASC LIMIT 1 &lt;/P&gt;</description>
      <pubDate>Sat, 08 Oct 2016 02:24:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115986#M42847</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-10-08T02:24:54Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115987#M42848</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/11938/kumarvaibhav1992.html"&gt;@Vaibhav Kumar &lt;/A&gt;&lt;/P&gt;&lt;P&gt;No question that this could work, just a bit concerned with performance. Order by is a very expensive operation involving a single reducer. Practically, you do one full table scan for that and once you get those three records you do another table scan to get the attributes of those records.&lt;/P&gt;&lt;P&gt;Look at an alternative here: &lt;A href="https://community.hortonworks.com/questions/24667/hive-top-n-records-within-a-group.html"&gt;https://community.hortonworks.com/questions/24667/hive-top-n-records-within-a-group.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If any response in this thread was helpful, don't forget to vote/accept the best response.&lt;/P&gt;</description>
      <pubDate>Sat, 08 Oct 2016 08:07:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115987#M42848</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-10-08T08:07:50Z</dc:date>
    </item>
    <item>
      <title>Re: Select nth row in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115988#M42849</link>
      <description>&lt;P&gt;I'm not sure about my data because here  even composite keys can produce duplicates. so going by analytical function is not a good choice for me.Any how query is not taking that much time for me.&lt;/P&gt;&lt;P&gt;I voted up for your Solution .Thanks For your Response. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486"&gt;@Constantin Stanca&lt;/A&gt; &lt;/P&gt;</description>
      <pubDate>Sun, 09 Oct 2016 02:34:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Select-nth-row-in-hive/m-p/115988#M42849</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-10-09T02:34:38Z</dc:date>
    </item>
  </channel>
</rss>

