
Sqoop creates duplicate values when importing from MSSQL


New Contributor

Hi, I am noticing an issue with a Sqoop import from an MSSQL table, run in an Oozie workflow.

We have a prod cluster with Ambari, HDP-2.6.3.0, Oozie 4.2.0, and Sqoop 1.4.6.


I am importing with the following command:

import --connect jdbc:sqlserver://${SQL_SERVER_IP_ADDRESS};databaseName=${SQL_SERVER_DB_NAME_USERS_TABLE} --username ${SQL_SERVER_DB_USERNAME} --password ${SQL_SERVER_DB_PASSWORD} --table users  --hive-import --hive-overwrite --hive-table users --hive-database customers --null-string \\N --null-non-string \\N --hive-delims-replacement '\0D' --fetch-size 50000

 
Every once in a while, Sqoop creates duplicate rows (exact duplicates) during the import. It actually imports every row twice, even though the table has a primary key (unique identifier). Is this a known issue? Could it be a network issue that causes Sqoop to misbehave?


Re: Sqoop creates duplicate values when importing from MSSQL

Guru
@irene,

Have you checked whether the duplicates are in the same data file or in different ones? If they are in different files, were the files created together, meaning they have the same timestamp? I just want to make sure the duplicates are not caused by old data left behind.
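One way to check this is sketched below, assuming the table's data files have first been copied to a local directory (for example with `hdfs dfs -get`) and that they are uncompressed text files. The function name `find_duplicate_rows` is hypothetical and not part of Sqoop or Hive. It counts how often each row appears in each file, so you can see whether a repeated row sits in one file or is spread across several.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_rows(data_dir):
    """Return {row: {file name: count}} for every row seen more than once.

    Assumption: data_dir is a local copy of the Hive table directory and
    each file is plain text with one row per line.
    """
    seen = defaultdict(lambda: defaultdict(int))
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open("r", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                # Key on the full row text; count occurrences per file.
                seen[line.rstrip("\n")][path.name] += 1
    # Keep only rows whose total count across all files exceeds one.
    return {
        row: dict(files)
        for row, files in seen.items()
        if sum(files.values()) > 1
    }
```

A row that maps to a single file name with a count of 2 was written twice into the same file. A row that maps to several file names was written to different files, in which case comparing the file timestamps is the next step.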

Were there any failed attempts in MR jobs?

Cheers
Eric

Re: Sqoop creates duplicate values when importing from MSSQL

New Contributor

Yes, the duplicates are in the same file.

As you can see, the files were created at the same time.

image.png
The job reports success. Before that step, we have a step that deletes the previous day's tables from Hive, and it also reports success. We also use

--hive-overwrite

in the command.
Capture.PNG
