Sqoop import - Special characters

All,

I am importing data from DB2 using sqoop import. It worked fine for most tables, but one table has special characters (Control-M, i.e. ^M) in its contents. During the import these characters are treated as newlines, so everything after one of them lands on the next line of the imported file, and every record after the first bad record is affected.

I can't figure out how to fix the import. Is there an easy way?

1 ACCEPTED SOLUTION

@Abhijeet Rajput

Sqoop should load data in UTF-8 by default. Run the following:

get db cfg for db_name

and check the value of Database_code_set. Then, in your mapred-site.xml, please try adding the following to the value of mapreduce.map.java.opts:

-Ddb2.jcc.charsetDecoderEncoder=3
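
For reference, here is a rough sketch of how that entry might look in mapred-site.xml. The -Xmx1024m part is only a placeholder assumption standing in for whatever JVM options your cluster already sets for map tasks; keep your existing options and append the new flag after them rather than overwriting the value.

```xml
<!-- Sketch only: -Xmx1024m is an assumed placeholder for your existing map JVM options -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024m -Ddb2.jcc.charsetDecoderEncoder=3</value>
</property>
```

If you would rather not change the cluster-wide file, the same property can usually be supplied for a single job as a generic Hadoop -D argument placed right after sqoop import, before the tool-specific options.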
