Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to deal with duplicate values while importing data from RDBMS to HDFS?

I am using Ambari.

sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100

When I run this command, it loads the newly added rows into the existing directory, but the output also contains duplicate records.

How can I avoid these duplicate records using Sqoop?
