How to deal with duplicate values while importing data from RDBMS to HDFS?
Labels: Apache Hadoop, Apache Sqoop
Contributor
Created 06-02-2017 06:38 AM
I am using Ambari.
sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100
When I execute this command, it loads the newly added rows into the existing directory, but the output also contains duplicate records.
How can I avoid these duplicate records using Sqoop?
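For context, one common cause of this symptom is re-running the import with the same fixed `--last-value 100`: every run then pulls all rows with ID greater than 100 again and appends them to the directory. The thread does not confirm that this is the cause here, so the following is only a minimal sketch under that assumption. It wraps the same import in a saved Sqoop job, which records the last imported value after each run. The job name `incremental_table_import` and the `RUN_SQOOP` switch are hypothetical, and `--username`/`-P` replace the credentials embedded in the connection string.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the duplicates come from re-running the import
# with the same hard-coded --last-value (not confirmed by the thread).
set -euo pipefail

JOB_NAME="incremental_table_import"   # hypothetical job name

# Emit the sqoop arguments one per line so they can be inspected first.
build_job_args() {
  printf '%s\n' \
    job --create "$JOB_NAME" -- import \
    --connect "jdbc:sqlserver://localhost:1433;database=db_name" \
    --username user_name -P \
    --table table_name \
    --target-dir /Sqoop/output \
    --incremental append \
    --check-column ID \
    --last-value 100
}

if [ "${RUN_SQOOP:-0}" = "1" ]; then
  # Create the saved job once; Sqoop stores the last imported value for it.
  mapfile -t args < <(build_job_args)
  sqoop "${args[@]}"
  # Each later import is then run with: sqoop job --exec "$JOB_NAME"
else
  # Dry run (default): print the command instead of executing it.
  printf 'sqoop '
  build_job_args | tr '\n' ' '
  echo
fi
```

With `RUN_SQOOP=1` the job is created once; each subsequent `sqoop job --exec incremental_table_import` should import only rows whose ID is greater than the value stored by the previous run. If rows in the source table can also be updated in place, append mode will not detect that, and a different incremental mode would be needed.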
1 ACCEPTED SOLUTION
Guru
Created 06-02-2017 08:46 PM
