<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to avoid duplicate row insertion in Hive? in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-avoid-duplicate-row-insertion-in-Hive/m-p/286153#M212248</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/33732"&gt;@Prakashcit&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That is by design:&lt;/P&gt;&lt;P&gt;A NOVALIDATE constraint is a constraint that can be enabled without Hive checking the existing data for rows that currently violate it.&lt;/P&gt;&lt;P&gt;This is useful if we know there’s data that violates the constraint but we want to quickly put a constraint in place to prevent further violations, with the intention of cleaning up any existing violations at some future point in time.&lt;/P&gt;&lt;P&gt;It’s also potentially useful if we know the data is clean and so want to avoid the potentially significant overhead of Hive having to check all the data to ensure there are indeed no violations.&lt;/P&gt;</description>
    <pubDate>Sun, 22 Dec 2019 18:17:14 GMT</pubDate>
    <dc:creator>Shelton</dc:creator>
    <dc:date>2019-12-22T18:17:14Z</dc:date>
  </channel>
</rss>

