Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Duplicate data in incremental load from sql to hive

Duplicate data in incremental load from sql to hive

Explorer

Hi all,

When i tried to do incremental load from sql to hive i'm getting duplicate data. I have only 2 rows in Sql, if i run this code again i got another 2 rows in Hive i.e,. duplicate occurred.

Please provide me solutions.

sqoop import --connect 'jdbc:sqlserver://XXX.XXX;database=PTWTarget' --username un --password Admin --table ActionItems --incremental lastmodified --check-column Lastupdateddate --merge-key ActionId --num-mappers 1 --hive-import --hive-table project_tracking.actionitems2

2 REPLIES 2
Highlighted

Re: Duplicate data in incremental load from sql to hive

New Contributor

Hi @Nethaji R,

In your sqoop command, --last-value attribute is missing. Use that attribute so you will not have duplicate data in your hive table. For better use of incremental query you can use sqoop job so everytime you don't have to remember last-value.

Here is the code what I used to solve your question.

sqoop import \
--connect 'connectionUrl' \
--username 'username' \
--password 'password' \
--table 'tableName' \
--incremental lastmodified \
--check-column 'columnName' \
--last-value "columnValue" \
--merge-key "mergeKeyColumnName" \
--hive-import \
--hive-table "hive-database.hive-table"

Let me know if it's not clear.

Reference:

incremental_imports

sqoop jobs

Highlighted

Re: Duplicate data in incremental load from sql to hive

Explorer

Hi

I find the solution, now there is no duplicate data occurred.

Thank you

sqoop import --connect 'jdbc:sqlserver://XX.XXX;database=XXXX' --username XX --password XXXX --table ActionItems --split-by ActionId --target-dir /user/hive/warehouse/project_tracking.db/actionitems --incremental append --check-column Lastupdateddate --last-value "2019-02-06 19:52:55.873"
Don't have an account?
Coming from Hortonworks? Activate your account here