Posts: 6
Registered: ‎09-20-2017

Sqoop not accepting extended ASCII as delimiters

I am trying to import data from Teradata using the command below.


sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx \
  --username xxxx -P \
  --query "SELECT TOP 1000 col1,col2,...col10 FROM table where 1=1 AND \$CONDITIONS" \
  --fields-terminated-by 'Ç' \
  --target-dir /path -m 1

We tried the import using Ç (C-cedilla, decimal 128) as the delimiter, but it shows up in the HDFS files as junk such as Ã.
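For reference, here is a minimal sketch of why Ç can turn into Ã: this assumes the delimiter is written out as UTF-8 but later read one byte at a time as Latin-1 (an assumption about our pipeline, not something I've confirmed):

```python
# 'Ç' is U+00C7; encoded as UTF-8 it becomes TWO bytes.
utf8_bytes = 'Ç'.encode('utf-8')
print(utf8_bytes)  # b'\xc3\x87'

# If a downstream tool reads those bytes as Latin-1 (one byte per
# character), the first byte 0xC3 renders as 'Ã' -- the junk character
# we see in HDFS -- followed by the 0x87 control character.
print(utf8_bytes.decode('latin-1'))  # 'Ã' + '\x87'
```

So the delimiter may not be corrupted at all; it may just be a two-byte UTF-8 sequence being displayed byte by byte.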


Current sample data:

24182Ã27-1746296ÃCRYSTAL DAWN MARTIN DCÃVENÃnullÃnullÃnullÃ50Ã10Ã2016-04-18 08:56:19.231Ã2016-04-18 08:56:19.231Ã2016-06-07 23:14:11.711Ã0ÃAÃ>,���������*�$�!�Ã�2�����e`��B)y���/O


The problem is that the data itself contains so much junk that almost every ASCII character appears somewhere in it. I'm also unable to strip delimiters from the data using --hive-drop-import-delims, because it's not supported by the Teradata connector, which throws this error:


17/09/20 07:19:36 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-drop-import-delims



Because of this, the record count never matches and is always higher than the source, since delimiter characters occurring inside the data are not being escaped.
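One thing we could try is passing the delimiter as an octal escape instead of the literal character, since Sqoop's delimiter arguments accept \0ooo escape sequences; whether the Teradata connector honors this is an assumption I haven't verified. Octal 307 is hex C7, the Latin-1 byte for Ç:

```shell
# Verify that the octal escape \307 produces the single byte 0xC7:
printf '\307' | od -An -tx1

# Hypothetical invocation (hosts/paths are placeholders from the original
# command, NOT tested) using the octal form of the delimiter:
# sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx \
#   --username xxxx -P \
#   --query "SELECT TOP 1000 col1,col2,...col10 FROM table where 1=1 AND \$CONDITIONS" \
#   --fields-terminated-by '\0307' \
#   --target-dir /path -m 1
```

That would at least rule out the shell or Sqoop mis-encoding the literal Ç on the command line.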


Any suggestions?