
How to deal with duplicate values while importing data from RDBMS to HDFS?

I am using Ambari.

sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100

When I run this command, it loads the newly added rows into the existing directory, but the output also contains duplicate records.

How can I avoid these duplicate records using Sqoop?
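
One likely cause, assuming the command is rerun unchanged each time: with --incremental append, Sqoop imports every row whose check column is greater than --last-value, so a hard-coded --last-value 100 pulls in all rows with ID above 100 on every run, including rows that earlier runs already appended. A common way around this is a saved Sqoop job, which stores the last imported value in the metastore and updates it after each successful run. The sketch below is not a tested setup: the job name incr_table_import is hypothetical, the connection details are the placeholders from the command above, and the run helper only prints the commands unless RUN=1 is set and sqoop is on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: wrap the import in a saved Sqoop job so the metastore tracks
# the last imported ID between runs instead of a hard-coded --last-value.

JOB_NAME="incr_table_import"   # hypothetical job name
CONNECT="jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password"

# One-time job definition. Note the standalone "--" before "import".
# --last-value here should be the highest ID already present in the target dir.
create_cmd=(sqoop job --create "$JOB_NAME" -- import
  --connect "$CONNECT"
  --table table_name
  --target-dir /Sqoop/output
  --append
  --incremental append
  --check-column ID
  --last-value 100)

# Run this for every load; Sqoop updates the stored last value on success.
exec_cmd=(sqoop job --exec "$JOB_NAME")

# Print a command (shell-quoted); execute it only when RUN=1 and sqoop exists.
run() {
  printf '%q ' "$@"; printf '\n'
  if [ "${RUN:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
    "$@"
  fi
}

run "${create_cmd[@]}"
run "${exec_cmd[@]}"
```

This only prevents new duplicates from a stale last value. Rows that are already duplicated in /Sqoop/output stay there until the directory is cleaned or re-imported, and the approach assumes ID is unique and only ever increases.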
