<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hive Primary key on partitioned column in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209775#M62884</link>
    <description>&lt;P&gt;The example I gave was a trimmed-down version of what I wanted to do to show the technical problem. &lt;/P&gt;&lt;P&gt;My expected PK is actually a compound PK, with a few partitioned columns and a few non-partitioned columns. &lt;/P&gt;&lt;P&gt;But I am afraid that your answer says it all, no can do :(.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Wed, 14 Jun 2017 21:10:47 GMT</pubDate>
    <dc:creator>guillaume_roger</dc:creator>
    <dc:date>2017-06-14T21:10:47Z</dc:date>
    <item>
      <title>Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209772#M62881</link>
      <description>&lt;P&gt;I want to add primary key constraints to Hive tables. The only thing is that my PK is actually a partitioned column. For instance:&lt;/P&gt;&lt;PRE&gt;CREATE TABLE pk 
(
  id INT, 
  PRIMARY KEY(part) DISABLE NOVALIDATE
)
PARTITIONED BY (part STRING)&lt;/PRE&gt;&lt;P&gt;This fails with the error message:&lt;/P&gt;&lt;PRE&gt;DBCException: SQL Error [10002] [42000]: Error while compiling statement: FAILED: SemanticException [Error 10002]: Invalid column reference part&lt;/PRE&gt;&lt;P&gt;Is there a way to use a partitioned column as PK?&lt;/P&gt;&lt;P&gt;Context: HDP 2.6, Hive 2.1 with LLAP.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2017 17:54:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209772#M62881</guid>
      <dc:creator>guillaume_roger</dc:creator>
      <dc:date>2017-06-14T17:54:14Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209773#M62882</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/13690/guillaumeroger.html" nodeid="13690"&gt;@Guillaume Roger&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I don't think a partitioned column can be used as a primary key. On top of that, if you partition on the primary key, each partition will hold only one record, so you end up with as many partitions as rows. With 10K records, for example, that means 10K partitions, which would be chaos. Hope it helps!&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2017 18:45:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209773#M62882</guid>
      <dc:creator>balavignesh_nag</dc:creator>
      <dc:date>2017-06-14T18:45:22Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209774#M62883</link>
      <description>&lt;P&gt;A side note: you should not partition on columns with high cardinality, such as IDs. Use bucketing instead.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2017 18:57:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209774#M62883</guid>
      <dc:creator>egarelnabi</dc:creator>
      <dc:date>2017-06-14T18:57:41Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209775#M62884</link>
      <description>&lt;P&gt;The example I gave was a trimmed-down version of what I wanted to do to show the technical problem. &lt;/P&gt;&lt;P&gt;My expected PK is actually a compound PK, with a few partitioned columns and a few non-partitioned columns. &lt;/P&gt;&lt;P&gt;But I am afraid that your answer says it all, no can do :(.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2017 21:10:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209775#M62884</guid>
      <dc:creator>guillaume_roger</dc:creator>
      <dc:date>2017-06-14T21:10:47Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209776#M62885</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/13690/guillaumeroger.html" nodeid="13690"&gt;@Guillaume Roger&lt;/A&gt; &lt;/P&gt;&lt;P&gt;I'm not sure I understood your reply correctly. If you have a compound key, there are workarounds available. Load the data into a staging table with concat(compound keys) as an extra column alongside the separate fields. On the staging table you then have the option of defining the primary key, as well as partitioning on the other fields that are used to build the compound key.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Jun 2017 22:23:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209776#M62885</guid>
      <dc:creator>balavignesh_nag</dc:creator>
      <dc:date>2017-06-14T22:23:35Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Primary key on partitioned column</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209777#M62886</link>
      <description>&lt;P&gt;Thanks, but I am not interested in such a surrogate key. The point of defining the PK was to help, e.g., reporting tools automatically discover joins between tables. A surrogate key would thus not do.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 15 Jun 2017 12:34:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-Primary-key-on-partitioned-column/m-p/209777#M62886</guid>
      <dc:creator>guillaume_roger</dc:creator>
      <dc:date>2017-06-15T12:34:15Z</dc:date>
    </item>
  </channel>
</rss>

