sqoop not accepting extended ascii as delimiters


I am trying to import data from Teradata using the command below.

 

sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx \
  --username xxxx -P \
  --query "SELECT TOP 1000 col1,col2,...col10 FROM table where 1=1 AND \$CONDITIONS" \
  --fields-terminated-by 'Ç' \
  --target-dir /path -m 1

We tried importing with Ç (C-cedilla, extended ASCII decimal 128) as the field delimiter, but in the HDFS files it shows up as the junk character Ã.

 

Current sample data:

24182Ã27-1746296ÃCRYSTAL DAWN MARTIN DCÃVENÃnullÃnullÃnullÃ50Ã10Ã2016-04-18 08:56:19.231Ã2016-04-18 08:56:19.231Ã2016-06-07 23:14:11.711Ã0ÃAÃ>,���������*�$�!�Ã�2�����e`��B)y���/O
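
One possible explanation for the Ã, offered as a guess and not a confirmed diagnosis: if the delimiter is written out as UTF-8, Ç (U+00C7) becomes the two bytes 0xC3 0x87, and a viewer that reads the file as Latin-1 shows the first byte as Ã. The small Python sketch below only illustrates that byte-level behaviour; it does not involve Sqoop or Teradata, and the variable names are made up for the example.

```python
# Assumption: the delimiter is stored as UTF-8 and later displayed as Latin-1.
delimiter = "\u00c7"  # Ç, LATIN CAPITAL LETTER C WITH CEDILLA

# UTF-8 needs two bytes for this character.
utf8_bytes = delimiter.encode("utf-8")
print(utf8_bytes)  # b'\xc3\x87'

# Reading those bytes as Latin-1 gives "Ã" followed by an invisible control character.
misread = utf8_bytes.decode("latin-1")
print(repr(misread))  # 'Ã\x87'

# In code page 437 ("extended ASCII") the same character is the single byte 128.
cp437_bytes = delimiter.encode("cp437")
print(cp437_bytes)  # b'\x80'
```

If this is what is happening, the delimiter in the file may actually be intact and only the display is misleading, so it could be worth checking the raw bytes of the file before changing the delimiter.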

 

The problem is that the data contains a lot of junk, so almost every ASCII character appears somewhere in it. I am also unable to strip delimiters from the data with --hive-drop-import-delims, because it is not supported with Teradata and fails with this error:

 

17/09/20 07:19:36 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-drop-import-delims

 

 

Because of this, the row count in HDFS does not match the source and is always higher, since the delimiters inside the data are not being escaped.
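
To illustrate the kind of mismatch I mean, here is a minimal Python sketch. It assumes the extra rows come from record delimiters (newlines) embedded in string columns; the two rows are invented for the example and are not my real data.

```python
# Hypothetical source rows; the second column of the first row contains a newline.
source_rows = [
    ["1", "line one\nline two"],
    ["2", "plain"],
]
delimiter = "\u00c7"  # Ç as the field delimiter

# Write the rows as unescaped text: fields joined by the delimiter,
# each record terminated by "\n".
text = "".join(delimiter.join(row) + "\n" for row in source_rows)

# A line-oriented reader treats every "\n" as the end of a record.
line_count = text.count("\n")
print(len(source_rows), line_count)  # 2 3
```

Two source rows turn into three lines, which matches the symptom of the target count always being higher than the source.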

 

Any suggestions?
