Support Questions

priyal · 06-02-2017

I am using Ambari.

sqoop import --connect "jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password" --table table_name --target-dir /Sqoop/output --append --incremental append --check-column ID --last-value 100

When I execute this command, it loads the newly added rows into the existing directory, but the output also contains duplicate records.

How can I avoid these duplicate records using Sqoop?

namaheshwari · 06-02-2017

The posts below might help:

https://community.hortonworks.com/questions/63761/sqoop-import-to-hive-again-stroing-repeted-recorde...

https://community.hortonworks.com/questions/51508/sqoop-imported-more-records-than-source.html
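One possible cause, assuming the command in the question is re-run unchanged: a fixed --last-value 100 makes every run import all rows with ID greater than 100 again, so rows loaded earlier are appended a second time. Sqoop's saved jobs are meant for this case, because the job stores the last imported value and updates it after each successful run. The following is only a sketch under that assumption. The job name incremental_import_job and the helper functions are hypothetical, and the connection string, table, and paths are the placeholders from the question.

```shell
#!/bin/sh
# Sketch only: JOB_NAME is a hypothetical name; the connection string, table,
# and target directory are the placeholders used in the question.
JOB_NAME="incremental_import_job"
CONNECT="jdbc:sqlserver://localhost:1433;database=db_name;username=user_name;password=password"

# Print the command by default; set DRY_RUN=0 to actually invoke sqoop.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the saved job once. The initial --last-value is stored with the job.
create_job() {
  run sqoop job --create "$JOB_NAME" -- import \
    --connect "$CONNECT" \
    --table table_name \
    --target-dir /Sqoop/output \
    --incremental append \
    --check-column ID \
    --last-value 100
}

# Execute the saved job for every load. After a successful run, Sqoop updates
# the stored last value, so the next run only picks up rows with a larger ID.
exec_job() {
  run sqoop job --exec "$JOB_NAME"
}

create_job
exec_job
```

With the default DRY_RUN=1 the script only prints the two sqoop commands, which makes it safe to inspect before running it against a real cluster with DRY_RUN=0. If saved jobs are not an option, the same effect needs the caller to pass the new --last-value reported at the end of each import into the next run.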

