New Contributor
Posts: 1
Registered: ‎02-03-2016

Using Non Printable Characters

I am trying to import data from a table that contains several text columns. The text in these columns may contain commas, newlines, or any other keyboard character. I therefore do not want to use the default delimiters available in Sqoop 1.4.6, and instead use the non-printable character '\0x11'. With the command below, the data imports into HDFS without any errors. However, when I query the Hive table via beeline, the values do not appear in a single column: Hive uses zero as the field separator even though I specified a hexadecimal value. What am I missing?

 

sqoop import --driver com.mysql.jdbc.Driver --connection-manager org.apache.sqoop.manager.GenericJdbcManager --connect jdbc:mysql://localhost:3306/sample --username chpapp --P --table comments --target-dir /data/mysql/comments --fields-terminated-by '\0x11' --hive-delims-replacement " "

 

CREATE EXTERNAL TABLE comments (id INT,notes STRING)  ROW FORMAT DELIMITED FIELDS TERMINATED BY "\0x11"  LINES TERMINATED BY "\n" STORED AS TEXTFILE LOCATION '/data/mysql/comments';

 

Thanks in advance.

Posts: 1,880
Kudos: 422
Solutions: 297
Registered: ‎07-31-2013

Re: Using Non Printable Characters

The hex syntax may not work in the Hive DDL. Try the octal form instead: FIELDS TERMINATED BY "\021"

021 is the octal equivalent of 0x11:
~> python
>>> print(repr("\021"))
'\x11'
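
If you want to double-check the escape for a different control character, or confirm that a value containing commas stays in one field, here is a minimal Python sketch. The helper names octal_escape and split_record are hypothetical (they are not part of Sqoop or Hive), and the sketch assumes one record per line and that no field value contains the delimiter itself.

```python
# Hypothetical helpers for illustration only; not part of Sqoop or Hive.
DELIM_CODE_POINT = 0x11  # the DC1 control character used in the question


def octal_escape(code_point):
    """Return the three-digit octal escape text, e.g. 0x11 -> backslash + '021'."""
    return "\\{:03o}".format(code_point)


def split_record(line, code_point=DELIM_CODE_POINT):
    """Split one imported text line on the delimiter character."""
    return line.rstrip("\n").split(chr(code_point))


# The escape text to try in FIELDS TERMINATED BY.
print(octal_escape(DELIM_CODE_POINT))

# A notes value with commas is still a single field after splitting.
sample = "42" + chr(DELIM_CODE_POINT) + "late, but fixed, thanks\n"
print(split_record(sample))
```

The first print gives the text to put between the quotes in the DDL; the second shows the id and the notes value as two fields, with the commas left alone.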