<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: top function in pig/hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126506#M17992</link>
    <description>&lt;P&gt;thanks a lot.&lt;/P&gt;</description>
    <pubDate>Sat, 06 Feb 2016 07:44:23 GMT</pubDate>
    <dc:creator>priyankavijayak</dc:creator>
    <dc:date>2016-02-06T07:44:23Z</dc:date>
    <item>
      <title>top function in pig/hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126503#M17989</link>
      <description>&lt;P&gt;In a dataset of approximately 200,000 (2 lakh) records, there is a column named tags, which holds a comma-separated list of the tags associated with each question. Examples of tags are "html" and "error". Sample values:&lt;/P&gt;&lt;P&gt;php,error,gd,image-processing&lt;/P&gt;&lt;P&gt;php,error,gd,image-processing&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;lisp,scheme,subjective,clojure&lt;/P&gt;&lt;P&gt;cocoa-touch,objective-c,design-patterns&lt;/P&gt;&lt;P&gt;cocoa-touch,objective-c,design-patterns&lt;/P&gt;&lt;P&gt;cocoa-touch,objective-c,design-patterns&lt;/P&gt;&lt;P&gt;core-animation&lt;/P&gt;&lt;P&gt;django,django-models&lt;/P&gt;&lt;P&gt;django,django-models&lt;/P&gt;&lt;P&gt;aspÃ»net&lt;/P&gt;&lt;P&gt;scala,pattern-matching,oop,object-oriented-design,design-principles&lt;/P&gt;&lt;P&gt;scala,pattern-matching,oop,object-oriented-design,design-principles&lt;/P&gt;&lt;P&gt;scala,pattern-matching,oop,object-oriented-design,design-principles&lt;/P&gt;&lt;P&gt;. . . . .&lt;/P&gt;&lt;P&gt;How can I find the 10 most commonly used tags in the dataset, using Pig or Hive?&lt;/P&gt;</description>
      <pubDate>Fri, 05 Feb 2016 04:46:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126503#M17989</guid>
      <dc:creator>priyankavijayak</dc:creator>
      <dc:date>2016-02-05T04:46:07Z</dc:date>
    </item>
    <item>
      <title>Re: top function in pig/hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126504#M17990</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2601/priyankavijayakumar11.html" nodeid="2601"&gt;@priyanka vijayakumar&lt;/A&gt; here is a good word count tutorial: &lt;A href="http://hortonworks.com/hadoop-tutorial/word-counting-with-apache-pig/"&gt;link&lt;/A&gt;. It uses Pig, HCatalog, and Hive; you will be better off using a combination of these.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Feb 2016 04:50:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126504#M17990</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-05T04:50:30Z</dc:date>
    </item>
    <item>
      <title>Re: top function in pig/hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126505#M17991</link>
      <description>&lt;P&gt;Here is a &lt;A href="https://en.wikipedia.org/wiki/Pig_%28programming_tool%29"&gt;Pig word count&lt;/A&gt; with comments. Pass the delimiter to TOKENIZE, in your case a comma: TOKENIZE(line, ','). You might have to choose a different filter based on your input. You can start by commenting the filter out and add it back later if needed. Finally, to extract only the top 10 entries you can use LIMIT: top10 = LIMIT ordered_word_count 10; (note that LIMIT takes the relation and the count without a comma between them). Be sure to inspect the stored file and make sure the words (tags) have been properly tokenized. If not, add the filter mentioned above.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Feb 2016 05:51:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126505#M17991</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-02-05T05:51:16Z</dc:date>
    </item>
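    <!--
    The approach described in the reply above (tokenize each line on commas,
    count each tag, order by count descending, keep the first 10) can be
    sanity-checked on a small sample before running the Pig job. What follows
    is only a minimal sketch in plain Python, not the Pig script from the
    linked page: the function name top_tags and the sample list are
    hypothetical and chosen purely for illustration.

```python
from collections import Counter


def top_tags(lines, n=10):
    """Return the n most frequent tags across comma-separated tag lines."""
    counts = Counter()
    for line in lines:
        # Same role as TOKENIZE(line, ','): split on commas.
        tags = [t.strip() for t in line.split(",")]
        # Same role as the optional filter: drop empty tokens.
        counts.update(t for t in tags if t)
    # Same role as ORDER BY count DESC followed by LIMIT n.
    return counts.most_common(n)


# Hypothetical sample taken from the rows quoted in the question.
sample = [
    "php,error,gd,image-processing",
    "php,error,gd,image-processing",
    "lisp,scheme,subjective,clojure",
    "django,django-models",
]

for tag, count in top_tags(sample, n=3):
    print(tag, count)
```

    Comparing this output against the first rows of the Pig result on the
    same sample is a quick way to confirm that the tags were tokenized as
    expected.
    -->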
    <item>
      <title>Re: top function in pig/hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126506#M17992</link>
      <description>&lt;P&gt;thanks a lot.&lt;/P&gt;</description>
      <pubDate>Sat, 06 Feb 2016 07:44:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126506#M17992</guid>
      <dc:creator>priyankavijayak</dc:creator>
      <dc:date>2016-02-06T07:44:23Z</dc:date>
    </item>
    <item>
      <title>Re: top function in pig/hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126507#M17993</link>
      <description>&lt;P&gt;thanks a lot.&lt;/P&gt;</description>
      <pubDate>Sat, 06 Feb 2016 07:44:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/top-function-in-pig-hive/m-p/126507#M17993</guid>
      <dc:creator>priyankavijayak</dc:creator>
      <dc:date>2016-02-06T07:44:33Z</dc:date>
    </item>
  </channel>
</rss>

