09-20-2017 04:23 AM
I am trying to import data from Teradata using the below command.
sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx
--username xxxx -P --query "SELECT TOP 1000 col1,col2,...col10 FROM table
where 1=1 AND \$CONDITIONS" --fields-terminated-by 'Ç'
--target-dir /path -m1
Currently we are importing with Ç (C-cedilla, Dec-128) as the field delimiter, but it is showing up in the HDFS files as junk characters like Ã.
Current sample data:
24182Ã27-1746296ÃCRYSTAL DAWN MARTIN DCÃVENÃnullÃnullÃnullÃ50Ã10Ã2016-04-18 08:56:19.231Ã2016-04-18 08:56:19.231Ã2016-06-07 23:14:11.711Ã0ÃAÃ>,ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½ï¿½*ï¿½$ï¿½!ï¿½Ãï¿½2ï¿½ï¿½ï¿½ï¿½ï¿½e`ï¿½ï¿½B)yï¿½ï¿½ï¿½/O
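For reference, the junk Ã looks like a UTF-8 vs Latin-1 mismatch: Ç written out as UTF-8 and read back as Latin-1 starts with Ã, which is exactly what appears in the sample above. A quick Python sanity check (this only illustrates the encoding round-trip, it is not part of the Sqoop job):

```python
# 'Ç' (U+00C7) encoded as UTF-8 is the two bytes 0xC3 0x87.
raw = 'Ç'.encode('utf-8')
print(raw)  # b'\xc3\x87'

# Reading those same bytes back as Latin-1 turns the first byte into 'Ã',
# which matches the junk delimiter seen in the HDFS files.
mojibake = raw.decode('latin-1')
print(mojibake[0])  # Ã
```

So the delimiter may be written correctly as UTF-8 and only rendered (or re-read) as Latin-1 downstream.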
The problem is that the data itself is so full of junk that nearly every ASCII character appears somewhere in it, so almost any single-character delimiter can collide with the data. I am also unable to strip delimiter characters from the data using --hive-drop-import-delims,
because it is not supported by the Teradata connector and throws this error:
17/09/20 07:19:36 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-drop-import-delims
Because of this, the record count never matches and is always higher than the source, since delimiter characters occurring inside the data are not being escaped.
Any suggestions?
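To quantify the mismatch, one rough check is to count fields per line in the exported files: any line whose field count differs from the expected column count contains an embedded delimiter (or a delimiter-split row). A small sketch, where the 10-column count from the query, the delimiter as it appears in the files, and the part-file path are all assumptions to adjust:

```python
# Flag records that do not split into the expected number of fields.
# DELIM and EXPECTED_FIELDS are assumptions: the delimiter byte as it
# actually landed in HDFS, and the 10 columns selected in the query.
DELIM = 'Ç'
EXPECTED_FIELDS = 10

def broken_lines(lines, delim=DELIM, expected=EXPECTED_FIELDS):
    """Return (line_number, field_count) for records that don't split cleanly."""
    bad = []
    for i, line in enumerate(lines, start=1):
        n = len(line.rstrip('\n').split(delim))
        if n != expected:
            bad.append((i, n))
    return bad

# Usage (path is a placeholder):
# bad = broken_lines(open('/path/part-m-00000', encoding='utf-8'))
```

Lines reported here are the ones inflating the count past the source row count.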