Reply
Highlighted
Explorer
Posts: 6
Registered: ‎09-20-2017

sqoop not accepting extended ascii as delimiters

I am trying to import data from teradata using the below command.

 

sqoop import --connect jdbc:teradata://xxxxx/DATABASE=xxxxx 
--username xxxx -P --query "SELECT TOP 1000 col1,col2,...col10 FROM table
where 1=1 AND \$CONDITIONS" --fields-terminated-by 'Ç'
--target-dir /path -m1

 Currently we tried import using  (Dec-128) Ç (cedilla), but its populationg in hdfs files as some junk Ã

 

Current sample data:

24182Ã27-1746296ÃCRYSTAL DAWN MARTIN DCÃVENÃnullÃnullÃnullÃ50Ã10Ã2016-04-18 08:56:19.231Ã2016-04-18 08:56:19.231Ã2016-06-07 23:14:11.711Ã0ÃAÃ>,���������*�$�!�Ã�2�����e`��B)y���/O

 

The problem is since the data has much junk it almost have all the ascii characters in data. Also i'm unable to escape delimiters in the data using--hive-drop-import-delims

 cos it's not supported in Teradata and throws error as 

 

17/09/20 07:19:36 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-drop-import-delims

 

 

Now because of this, data count is not matching and is always more than source cos the delimiters inside data is not being escaped.

 

Any suggestions ??

Announcements