
Using Non Printable Characters

New Contributor

I am trying to import data from a table containing several text columns. The text in these columns may contain commas, newlines, or any other keyboard characters along with the regular text, so I do not want to use the default delimiters in Sqoop 1.4.6 and instead use the non-printable character '\0x11'. When I run the command below, I am able to import the data into HDFS without any errors. However, when the Hive table is queried via beeline, the values are not shown in a single column: zero is used as the field separator even though I specify a hexadecimal value. What am I missing?

 

sqoop import \
  --driver com.mysql.jdbc.Driver \
  --connection-manager org.apache.sqoop.manager.GenericJdbcManager \
  --connect jdbc:mysql://localhost:3306/sample \
  --username chpapp --P \
  --table comments \
  --target-dir /data/mysql/comments \
  --fields-terminated-by '\0x11' \
  --hive-delims-replacement " "

 

CREATE EXTERNAL TABLE comments (id INT, notes STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY "\0x11"
  LINES TERMINATED BY "\n"
STORED AS TEXTFILE
LOCATION '/data/mysql/comments';
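
For reference, a quick way to check which byte actually landed between the columns in HDFS (assuming the default part-m-* file names produced by the import above) is to dump the first record through od:

# Print the raw bytes of the first imported record; the field delimiter
# should appear as octal 021 (hex 0x11) between the two columns.
hdfs dfs -cat /data/mysql/comments/part-m-* | head -n 1 | od -c

If 021 shows up in that output, the imported files do carry the intended delimiter.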

 

Thanks in advance.

1 REPLY

Re: Using Non Printable Characters

Master Guru
The hex syntax may not work. Try the octal one: FIELDS TERMINATED BY "\021"

021 is the octal equivalent of 0x11:
~> python
>>> print repr("\021")
'\x11'
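
If it helps, this is the DDL from the question with only the FIELDS TERMINATED BY clause changed to the octal escape:

-- Same external table as above, with the delimiter written as the
-- octal escape \021 (hex 0x11), which Hive's DDL parser understands.
CREATE EXTERNAL TABLE comments (id INT, notes STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY "\021"
  LINES TERMINATED BY "\n"
STORED AS TEXTFILE
LOCATION '/data/mysql/comments';

Since the table is EXTERNAL, it can be dropped and re-created without touching the data under /data/mysql/comments.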