Is it possible to display special characters in a custom TBLPROPERTIES property?

Contributor

Hi,

I am trying to set custom properties (using the "ALTER TABLE ... SET TBLPROPERTIES ..." command) that contain special characters, like 'ç' or 'é'. The problem is that I get '\u00e7' and '\u00e9' back when I execute a "DESCRIBE TABLE ... FORMATTED" command.

Is there a way to get the properly encoded characters back?

Here is my command:

ALTER TABLE mydatabase.mytable SET TBLPROPERTIES ('Test'='François');

Here is what I get back from a DESCRIBE TABLE ... FORMATTED command:

Test Fran\\u00e7ois

Thanks in advance for your answer!

Sylvain.

1 ACCEPTED SOLUTION

Super Guru

@dvt isoft

Your question is how to store and retrieve French accented characters in the table's data definition, specifically in table properties. Hive expects UTF-8 by default, both in the data definition and in the stored data. I am not aware of an option that applies the encoding approach to the data definition. For the stored data, you can encode and decode using a SerDe with an encoding property, as described by @Boris Demerov in this thread.
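
Since no Hive-side option is known here, one possible workaround is to decode the escapes on the client after reading the DESCRIBE output. The following is a minimal sketch, assuming the value comes back with JSON-style \uXXXX escapes (possibly with a doubled backslash, as in the output quoted in the question). The function name decode_unicode_escapes is hypothetical and is not part of Hive or any Hive client.

```python
import re

# Matches one or more backslashes, then "u", then four hex digits,
# which covers both \u00e7 and \\u00e7 as seen in the DESCRIBE output.
_ESCAPE = re.compile(r"\\+u([0-9a-fA-F]{4})")


def decode_unicode_escapes(value: str) -> str:
    """Replace backslash-u escape sequences with the characters they encode.

    Hypothetical helper: assumes every such sequence in the value is an
    escape produced by the metastore output, not literal text.
    """
    decoded = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), value)
    # Join surrogate pairs (characters outside the Basic Multilingual Plane)
    # that were emitted as two consecutive escapes.
    return decoded.encode("utf-16", "surrogatepass").decode("utf-16")


if __name__ == "__main__":
    # Property value as shown in the question.
    print(decode_unicode_escapes(r"Fran\\u00e7ois"))  # prints François
```

This only changes how the value is displayed on the client side; the property as stored and returned by Hive is left untouched.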


3 REPLIES

Contributor

@dvt isoft

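Custom SerDes are always a last resort. Hive expects UTF-8 data. If the encoding is, say, ISO/IEC 8859-1, you will need to either convert the data to UTF-8 or, starting with Hive 0.14, use the feature added in https://issues.apache.org/jira/browse/HIVE-7142, which lets you declare the encoding through the "serialization.encoding" SerDe property. For French, I believe the value would be a charset name such as ISO-8859-1 rather than a language code like FR. See below an example for GBK: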

CREATE TABLE person (id INT, name STRING, desc STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ("serialization.encoding"='GBK');

Contributor

Thank you, it seems that TBLPROPERTIES doesn't like French characters...