<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: sqoop import to hive again storing repeated records in same hive table in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152543#M44657</link>
    <description>&lt;P&gt;Dear all,&lt;/P&gt;&lt;P&gt;I have a table in SQL Server in which a column contains random unique numbers. There is no primary key, but we want to perform an incremental append or lastmodified import using Sqoop, so please help.&lt;/P&gt;&lt;P&gt;Note: this is a critical issue.&lt;/P&gt;</description>
    <pubDate>Fri, 17 May 2019 23:03:20 GMT</pubDate>
    <dc:creator>HadoopHelp</dc:creator>
    <dc:date>2019-05-17T23:03:20Z</dc:date>
    <item>
      <title>sqoop import to hive again storing repeated records in same hive table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152540#M44654</link>
      <description>&lt;P&gt;sqoop import --connect jdbc:mysql://localhost/test --table a2 --username root --password -m 1 --hive-import --hive-database default --hive-table a2 --target-dir /tmp/n11 --driver com.mysql.jdbc.Driver&lt;/P&gt;&lt;P&gt;1. The MySQL table a2 contains 2 records, for example:&lt;/P&gt;&lt;P&gt;id name&lt;/P&gt;&lt;P&gt;1 aa&lt;/P&gt;&lt;P&gt;2 bb&lt;/P&gt;&lt;P&gt;2. Initially I run the above command; it creates the Hive table and loads the 2 records.&lt;/P&gt;&lt;P&gt;3. Then I run the same command again, and it appends the same records, like this:&lt;/P&gt;&lt;P&gt;id name&lt;/P&gt;&lt;P&gt;1 aa&lt;/P&gt;&lt;P&gt;2 bb&lt;/P&gt;&lt;P&gt;1 aa&lt;/P&gt;&lt;P&gt;2 bb&lt;/P&gt;&lt;P&gt;How can I avoid this duplicate-record generation in the Hive table when using Sqoop? Please suggest a solution.&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;swathi&lt;/P&gt;</description>
      <pubDate>Thu, 27 Oct 2016 17:28:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152540#M44654</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2016-10-27T17:28:54Z</dc:date>
    </item>
    <item>
      <title>Re: sqoop import to hive again storing repeated records in same hive table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152541#M44655</link>
      <description>&lt;P&gt;Your table must have a column that tracks the modified date (the date of insert or update). If you have such a column, then use:&lt;EM&gt; --check-column {modified_date_col} --incremental lastmodified --last-value {modified_date}. &lt;/EM&gt;If you do not have this column, you cannot avoid your issue.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Oct 2016 02:47:41 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152541#M44655</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-10-28T02:47:41Z</dc:date>
    </item>
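The incremental lastmodified approach described in the reply above can be sketched as a Sqoop invocation. This is a hedged sketch, not a command from the thread: the check column `modified_date`, the last value, and the merge key `id` are illustrative placeholders that must match your source table.

```shell
# Sketch of an incremental "lastmodified" import (placeholder names).
# --check-column must be a timestamp column maintained on insert/update;
# --last-value is the high-water mark from the previous run;
# --merge-key merges re-imported rows by key instead of appending duplicates.
sqoop import \
  --connect jdbc:mysql://localhost/test \
  --username root -P \
  --table a2 \
  --target-dir /tmp/n11 \
  --check-column modified_date \
  --incremental lastmodified \
  --last-value "2016-10-27 00:00:00" \
  --merge-key id \
  -m 1
```

When run as a saved job (`sqoop job --create ...`), Sqoop stores and updates `--last-value` automatically between runs.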
    <item>
      <title>Re: sqoop import to hive again storing repeated records in same hive table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152542#M44656</link>
      <description>&lt;P&gt;Thank you so much.&lt;/P&gt;</description>
      <pubDate>Fri, 28 Oct 2016 11:54:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152542#M44656</guid>
      <dc:creator>Former Member</dc:creator>
      <dc:date>2016-10-28T11:54:53Z</dc:date>
    </item>
    <item>
      <title>Re: sqoop import to hive again storing repeated records in same hive table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152543#M44657</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;&lt;P&gt;I have a table in SQL Server in which a column contains random unique numbers. There is no primary key, but we want to perform an incremental append or lastmodified import using Sqoop, so please help.&lt;/P&gt;&lt;P&gt;Note: this is a critical issue.&lt;/P&gt;</description>
      <pubDate>Fri, 17 May 2019 23:03:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152543#M44657</guid>
      <dc:creator>HadoopHelp</dc:creator>
      <dc:date>2019-05-17T23:03:20Z</dc:date>
    </item>
    <item>
      <title>Re: sqoop import to hive again storing repeated records in same hive table</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152544#M44658</link>
      <description>&lt;P&gt;Greg's answer applies to your incremental import/export case as well. Also, if your source table has a column that is a sequential index, it can be used with the --split-by option to distribute the data across mappers, scaling parallelism and reducing the job's runtime.&lt;/P&gt;&lt;P&gt;My understanding is that a column of random numbers, if used as the split key, can cause skew, leading to uneven runtimes across map tasks.&lt;/P&gt;</description>
      <pubDate>Sat, 18 May 2019 04:12:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/sqoop-import-to-hive-again-stroing-repeted-recordes-in-same/m-p/152544#M44658</guid>
      <dc:creator>kushalbohra</dc:creator>
      <dc:date>2019-05-18T04:12:54Z</dc:date>
    </item>
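The --split-by suggestion above can be sketched as follows. This is a sketch under assumptions: the connection string, credentials, and the sequential column `seq_id` are hypothetical placeholders, not details from the thread.

```shell
# Sketch of parallelizing an import on a sequential index column.
# Sqoop computes min(seq_id)/max(seq_id) and gives each of the 4 mappers
# an even slice of that range; a uniformly increasing key keeps the
# slices balanced, whereas a random-valued key can skew them.
sqoop import \
  --connect "jdbc:sqlserver://dbhost:1433;databaseName=test" \
  --username user -P \
  --table a2 \
  --split-by seq_id \
  -m 4 \
  --target-dir /tmp/a2
```

Without a primary key, Sqoop requires either --split-by or `-m 1` (a single mapper) for a table import.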
  </channel>
</rss>

