How to deal with duplicate values while importing data from RDBMS to HDFS?

I am using Ambari.

sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100

When I execute this command, it loads the newly added rows into the existing directory, but the output also contains duplicate records.

How can I avoid these duplicate records using Sqoop?
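
One possible cause, offered as a guess rather than a diagnosis: in append mode Sqoop imports the rows whose check column is greater than --last-value, so re-running the command above with the same --last-value 100 would append the same rows a second time. The sketch below does not call Sqoop at all. It imitates that behaviour with awk on a made-up CSV file so the effect is easy to see; the file names, the sample rows, and the import_after helper are all hypothetical.

```shell
#!/bin/sh
# Toy stand-in for the source table: one row per line, "ID,value".
# All file names and data here are invented for illustration.
workdir=$(mktemp -d)
src="$workdir/source.csv"
target="$workdir/target.csv"
printf '%s\n' '99,a' '100,b' '101,c' '102,d' > "$src"
: > "$target"

# Hypothetical helper that mimics an append-mode incremental import:
# rows whose ID is strictly greater than the given last value are
# appended to the end of the target file.
import_after() {
    awk -F, -v last="$1" '$1 + 0 > last + 0' "$src" >> "$target"
}

# Two runs with the same last value append rows 101 and 102 twice.
import_after 100
import_after 100
dup_count=$(wc -l < "$target" | tr -d ' ')

# Start over, and this time advance the last value to the highest
# imported ID before the second run, so nothing is imported again.
: > "$target"
last=100
import_after "$last"
last=$(awk -F, 'BEGIN { m = 0 } $1 + 0 > m { m = $1 + 0 } END { print m }' "$target")
import_after "$last"
clean_count=$(wc -l < "$target" | tr -d ' ')

echo "same last value: $dup_count rows; advanced last value: $clean_count rows"
```

If this matches what is happening, the usual remedy is to pass the highest ID already imported as --last-value on each run. A saved Sqoop job (sqoop job --create) is meant to keep track of that value between runs.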
