<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Primary key partition handling from HAWQ to hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131442#M39424</link>
    <description>&lt;P&gt;It is pretty simple to configure:&lt;/P&gt;&lt;P&gt;&lt;A href="http://hdb.docs.pivotal.io/20/pxf/ConfigurePXF.html#topic_i3f_hvm_ss" target="_blank"&gt;http://hdb.docs.pivotal.io/20/pxf/ConfigurePXF.html#topic_i3f_hvm_ss&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 01 Sep 2016 22:32:53 GMT</pubDate>
    <dc:creator>jroberts</dc:creator>
    <dc:date>2016-09-01T22:32:53Z</dc:date>
    <item>
      <title>Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131434#M39416</link>
      <description>&lt;P&gt;I understand that HAWQ can handle primary key partitions. In a HAWQ to Hive migration, what is the best approach to data ingestion?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:37:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131434#M39416</guid>
      <dc:creator>yjagadeesan</dc:creator>
      <dc:date>2022-09-16T10:37:39Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131435#M39417</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1929/yjagadeesan.html" nodeid="1929"&gt;@Yogeshprabhu&lt;/A&gt;&lt;P&gt;Move the HAWQ primary key check constraint into the data ingestion script. With Sqoop, for example, use a custom query to pull only the rows that satisfy the check constraint, and create a child table in Hive for them. That way you can achieve the same schema structure in HAWQ and Hive.&lt;/P&gt;&lt;PRE&gt;CONSTRAINT rank_1_prt_2_check CHECK (year &amp;gt;= 2001 AND year &amp;lt; 2002)
)
INHERITS ("Test".rank)
&lt;/PRE&gt;&lt;P&gt;Move this constraint into the condition of the Sqoop script, and create a separate Hive table for each HAWQ child table.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 20:07:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131435#M39417</guid>
      <dc:creator>njayakumar</dc:creator>
      <dc:date>2016-09-01T20:07:15Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131436#M39418</link>
      <description>&lt;P&gt;Thanks &lt;A rel="user" href="https://community.cloudera.com/users/12353/njayakumar.html" nodeid="12353"&gt;@njayakumar&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;How would nested partitions be handled?&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 20:09:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131436#M39418</guid>
      <dc:creator>yjagadeesan</dc:creator>
      <dc:date>2016-09-01T20:09:01Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131437#M39419</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1929/yjagadeesan.html" nodeid="1929"&gt;@Yogeshprabhu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Migrating multiple nested subpartitions from HAWQ to Hive with Sqoop will be challenging; if you need to implement it, you would have to use the Sqoop2 APIs. I would recommend importing the table as it is, with one parent partition, into HDFS. Then create an external table over it and migrate it to an internal table with the required partitions.&lt;/P&gt;&lt;P&gt;Please remember that having a large number of partitions, each with a small amount of data, in Hive might hinder performance.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 20:13:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131437#M39419</guid>
      <dc:creator>njayakumar</dc:creator>
      <dc:date>2016-09-01T20:13:34Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131438#M39420</link>
      <description>&lt;P&gt;What is a "Primary key partition"? I've never heard of that before.&lt;/P&gt;&lt;P&gt;- HAWQ doesn't support indexes, so it doesn't support a Primary Key constraint.&lt;/P&gt;&lt;P&gt;- HAWQ does have table partitioning, which can be by a list or a range of values. Most commonly, a partition will be based on a date or timestamp column and cover a range of an entire month, quarter, or year. This is similar to Hive partitioning, but Hive can't partition on a range of values like HAWQ can.&lt;/P&gt;&lt;P&gt;- HAWQ also has table distribution, which can be either random or a hash of a column or columns. With HAWQ 2.0, it is recommended to use random distribution.&lt;/P&gt;&lt;P&gt;So your question is how to migrate data from HAWQ to Hive. First off, Sqoop would be pretty slow. It would be a single process to unload data. I would never recommend using Sqoop for something like this. Instead, you should use a Writable External Table in HAWQ that writes, in parallel, directly to HDFS.&lt;/P&gt;&lt;P&gt;&lt;A href="http://hdb.docs.pivotal.io/20/pxf/PXFExternalTableandAPIReference.html" target="_blank"&gt;http://hdb.docs.pivotal.io/20/pxf/PXFExternalTableandAPIReference.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 21:43:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131438#M39420</guid>
      <dc:creator>jroberts</dc:creator>
      <dc:date>2016-09-01T21:43:06Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131439#M39421</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11761/jroberts.html" nodeid="11761"&gt;@Jon Roberts&lt;/A&gt;&lt;P&gt;Sqoop can run in parallel, either by splitting on a column ID or by explicitly setting the number of mappers.&lt;/P&gt;&lt;P&gt;In most places, HAWQ is managed by a different team, so creating an external table involves a lot of process changes.&lt;/P&gt;&lt;P&gt;I am not sure how HAWQ handles HDFS writes on a secured cluster.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 21:49:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131439#M39421</guid>
      <dc:creator>njayakumar</dc:creator>
      <dc:date>2016-09-01T21:49:24Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131440#M39422</link>
      <description>&lt;P&gt;HAWQ's external table supports secured clusters.  But I would always prefer to just create a single external table and then run a typical "INSERT INTO ext_table SELECT * FROM hawq_table;" because it would be so much faster to work with.  &lt;/P&gt;&lt;P&gt;Speaking of speed, I haven't heard of anyone moving data from HAWQ to Hive.  It is always the other way round!  HAWQ is so much faster and has better SQL support than Hive.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 22:00:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131440#M39422</guid>
      <dc:creator>jroberts</dc:creator>
      <dc:date>2016-09-01T22:00:36Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131441#M39423</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11761/jroberts.html" nodeid="11761"&gt;@Jon Roberts&lt;/A&gt;&lt;P&gt;Could you please elaborate on how external tables support secured clusters? I am not sure how HAWQ handles HDFS writes to a different secured Hadoop cluster using a writable external table.&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 22:05:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131441#M39423</guid>
      <dc:creator>njayakumar</dc:creator>
      <dc:date>2016-09-01T22:05:19Z</dc:date>
    </item>
    <item>
      <title>Re: Primary key partition handling from HAWQ to hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131442#M39424</link>
      <description>&lt;P&gt;It is pretty simple to configure:&lt;/P&gt;&lt;P&gt;&lt;A href="http://hdb.docs.pivotal.io/20/pxf/ConfigurePXF.html#topic_i3f_hvm_ss" target="_blank"&gt;http://hdb.docs.pivotal.io/20/pxf/ConfigurePXF.html#topic_i3f_hvm_ss&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 01 Sep 2016 22:32:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Primary-key-partition-handling-from-HAWQ-to-hive/m-p/131442#M39424</guid>
      <dc:creator>jroberts</dc:creator>
      <dc:date>2016-09-01T22:32:53Z</dc:date>
    </item>
  </channel>
</rss>

