Welcome to the Cloudera Community

tkrish03 · ‎09-20-2017

I am trying to import data from Teradata using the command below.

sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx 
--username xxxx -P --query "SELECT TOP 1000 col1,col2,...col10 FROM table 
 where 1=1 AND \$CONDITIONS" --fields-terminated-by 'Ç' 
--target-dir /path -m1

We tried importing with Ç (cedilla, Dec-128) as the field delimiter, but it shows up in the HDFS files as the junk character Ã.
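One possible explanation for the Ã, sketched in Python: if the delimiter is written to HDFS as UTF-8 and the file is then viewed as Latin-1, the first of the two UTF-8 bytes is rendered as Ã. This is an assumption about the encodings involved, not something confirmed by the Sqoop output. It also shows that Ç is byte 128 only in code page 437.

```python
# Assumption: Sqoop writes the delimiter as UTF-8 and the viewer decodes as Latin-1.
delimiter = "Ç"  # U+00C7

utf8_bytes = delimiter.encode("utf-8")    # two bytes: 0xC3 0x87
misread = utf8_bytes.decode("latin-1")    # 0xC3 is shown as 'Ã', 0x87 is a control character

print(utf8_bytes.hex())                   # c387
print(repr(misread))

# The same character has different single-byte codes depending on the code page.
cp437_code = delimiter.encode("cp437")[0]      # 128
latin1_code = delimiter.encode("latin-1")[0]   # 199
print(cp437_code, latin1_code)
```

If this is what is happening, the delimiter in the file is really a two-byte sequence, so any reader that expects a single-byte delimiter will not split on it correctly.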

Current sample data:

24182Ã27-1746296ÃCRYSTAL DAWN MARTIN DCÃVENÃnullÃnullÃnullÃ50Ã10Ã2016-04-18 08:56:19.231Ã2016-04-18 08:56:19.231Ã2016-06-07 23:14:11.711Ã0ÃAÃ>,ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½*ï¿½$ï¿½!ï¿½Ãï¿½2ï¿½ï¿½ï¿½ï¿½ï¿½e`ï¿½ï¿½B)yï¿½ï¿½ï¿½/O

The problem is that the data contains a lot of junk, so almost every ASCII character appears somewhere in it. I also cannot strip the delimiters from the data using --hive-drop-import-delims, because it is not supported for Teradata and throws this error:

17/09/20 07:19:36 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-drop-import-delims

Because of this, the row count does not match and is always higher than at the source, since the delimiters inside the data are not being escaped.
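A minimal Python sketch of how this mismatch can arise, assuming the extra rows come from newline characters embedded in column values. The sample rows and the `drop_delims` helper are hypothetical. The helper only mimics what --hive-drop-import-delims is documented to do (drop \n, \r and \01 from string fields); it is not Sqoop code.

```python
# Hypothetical sample: two source rows, one with an embedded newline.
rows = [
    ("24182", "CRYSTAL DAWN MARTIN DC"),
    ("24183", "line one\nline two"),
]
field_delim = "Ç"


def to_text(records):
    """Serialise records as delimited text, one record per line."""
    return "\n".join(field_delim.join(r) for r in records) + "\n"


def drop_delims(value):
    """Hypothetical helper: remove \\n, \\r and \\x01 from a string field."""
    return value.translate({ord(c): None for c in "\n\r\x01"})


raw_count = len(to_text(rows).splitlines())
clean_rows = [tuple(drop_delims(v) for v in r) for r in rows]
clean_count = len(to_text(clean_rows).splitlines())

print(len(rows), raw_count, clean_count)
```

With the embedded newline left in place the file has one more line than the source has rows. After the characters are dropped the counts agree.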

Any suggestions?


sqoop not accepting extended ascii as delimiters