<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Creating Indexes in Hive in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149603#M112121</link>
    <description>&lt;P&gt;The short answer is no. Indexes in Hive are not recommended.&lt;/P&gt;&lt;P&gt;The reason is ORC. ORC has built-in indexes that let the reader skip blocks of data during a read, and it also supports Bloom filters. Together these pretty much replicate what Hive indexes did, and they do it automatically inside the data format, with no need to manage a separate index table (which is essentially what a Hive index is). I would rather spend my time setting up the ORC tables properly.&lt;/P&gt;&lt;P&gt;Again, a shameless plug:&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data"&gt;http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 19 Feb 2016 17:33:15 GMT</pubDate>
    <dc:creator>bleonhardi</dc:creator>
    <dc:date>2016-02-19T17:33:15Z</dc:date>
    <item>
      <title>Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149601#M112119</link>
      <description>&lt;P&gt;Is creating indexes on Hive tables recommended?&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.slideshare.net/ye.mikez/hive-tuning?next_slideshow=1" target="_blank"&gt;http://www.slideshare.net/ye.mikez/hive-tuning?next_slideshow=1&lt;/A&gt;&lt;/P&gt;&lt;P&gt;These slides seem to suggest that creating indexes should be avoided. I would like some thoughts from the community on this.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Feb 2016 15:30:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149601#M112119</guid>
      <dc:creator>sdutta</dc:creator>
      <dc:date>2016-02-19T15:30:25Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149602#M112120</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/137/sdutta.html"&gt;Shivaji&lt;/A&gt;, have you checked the link below? It explains when to avoid indexing in Hive:&lt;/P&gt;&lt;P&gt;&lt;A href="https://acadgild.com/blog/indexing-in-hive/" target="_blank"&gt;https://acadgild.com/blog/indexing-in-hive/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Another link with useful information about indexing in Hive:&lt;/P&gt;&lt;P&gt;&lt;A href="http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=5E97F9C310D5978ED19CF9F0E96D2407?doi=10.1.1.633.4589&amp;amp;rep=rep1&amp;amp;type=pdf" target="_blank"&gt;http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=5E97F9C310D5978ED19CF9F0E96D2407?doi=10.1.1.633.4589&amp;amp;rep=rep1&amp;amp;type=pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;or search for&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;A href="https://www.google.co.in/url?sa=t&amp;amp;rct=j&amp;amp;q=&amp;amp;esrc=s&amp;amp;source=web&amp;amp;cd=18&amp;amp;cad=rja&amp;amp;uact=8&amp;amp;ved=0ahUKEwju7ZTCrIPLAhWHmpQKHVLzDrE4ChAWCEkwBw&amp;amp;url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Bjsessionid%3D5E97F9C310D5978ED19CF9F0E96D2407%3Fdoi%3D10.1.1.633.4589%26rep%3Drep1%26type%3Dpdf&amp;amp;usg=AFQjCNEVKBf-jKpya3AV5EZtp9OJ84oWzg&amp;amp;sig2=CSgz9njfVmVII9P5O8iWcw&amp;amp;bvm=bv.114733917,d.dGo"&gt;index-based join operations in hive - CiteSeer&lt;/A&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I hope this helps you decide whether to use indexes in Hive.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Feb 2016 16:03:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149602#M112120</guid>
      <dc:creator>rushikeshdeshmu</dc:creator>
      <dc:date>2016-02-19T16:03:24Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149603#M112121</link>
      <description>&lt;P&gt;The short answer is no. Indexes in Hive are not recommended.&lt;/P&gt;&lt;P&gt;The reason is ORC. ORC has built-in indexes that let the reader skip blocks of data during a read, and it also supports Bloom filters. Together these pretty much replicate what Hive indexes did, and they do it automatically inside the data format, with no need to manage a separate index table (which is essentially what a Hive index is). I would rather spend my time setting up the ORC tables properly.&lt;/P&gt;&lt;P&gt;Again, a shameless plug:&lt;/P&gt;&lt;P&gt;&lt;A href="http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data"&gt;http://www.slideshare.net/BenjaminLeonhardi/hive-loading-data&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Feb 2016 17:33:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149603#M112121</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-02-19T17:33:15Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149604#M112122</link>
      <description>&lt;P&gt;@shivaji, if the original question has been answered, please accept the best answer.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Feb 2016 13:30:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149604#M112122</guid>
      <dc:creator>rushikeshdeshmu</dc:creator>
      <dc:date>2016-02-23T13:30:29Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149605#M112123</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/137/sdutta.html" nodeid="137"&gt;@Shivaji&lt;/A&gt; I agree with Benjamin. Hive indexes are not recommended.&lt;/P&gt;</description>
      <pubDate>Tue, 23 Feb 2016 13:38:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149605#M112123</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-23T13:38:42Z</dc:date>
    </item>
    <item>
      <title>Re: Creating Indexes in Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149606#M112124</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/168/bleonhardi.html" nodeid="168"&gt;@Benjamin Leonhardi&lt;/A&gt;, on slide 24 you note that a small stripe size indicates a memory problem during load. Do you know what memory problem that would be? I have about 3,500 records per stripe and was wondering where I should look. Thanks!&lt;/P&gt;</description>
      <pubDate>Sat, 25 Mar 2017 13:18:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-Indexes-in-Hive/m-p/149606#M112124</guid>
      <dc:creator>james1</dc:creator>
      <dc:date>2017-03-25T13:18:50Z</dc:date>
    </item>
  </channel>
</rss>

