Is it possible to display special characters in a custom TBLPROPERTIES property?

Contributor

Hi,

I am trying to set custom properties (using the "ALTER TABLE ... SET TBLPROPERTIES ..." command) that contain special characters, such as 'ç' or 'é'. The problem is that I get '\u00e7' and '\u00e9' back when I execute a "DESCRIBE TABLE ... FORMATTED" command.

Is there a way to get the properly encoded characters back?

Here is my command:

ALTER TABLE mydatabase.mytable SET TBLPROPERTIES ('Test'='François');

Here is what I get from a DESCRIBE TABLE ... FORMATTED command:

Test Fran\\u00e7ois

Thanks in advance for your answer!

Sylvain.

1 ACCEPTED SOLUTION

Super Guru

@dvt isoft

Your question is how to store and retrieve French accented characters in the table data definition, specifically in table properties. Hive expects UTF-8 by default, both in the data definition and in the stored data. I am not aware of an option that applies the SerDe encoding approach to the data definition. For stored data, you can encode and decode using a special SerDe, as described by @Boris Demerov in this thread.
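
If the escaped value only needs to be displayed correctly, one possible workaround is to decode the \uXXXX sequences on the client side after reading the DESCRIBE output. The following is a minimal Python sketch, not a Hive feature. It assumes the property value arrives as plain text containing literal \uXXXX escapes (with one or more leading backslashes), and the function name decode_unicode_escapes is hypothetical.

```python
import re

# Matches a literal \uXXXX escape, tolerating one or more leading backslashes
# (the thread shows both "\u00e7" and "\\u00e7" forms).
_ESCAPE = re.compile(r"\\+u([0-9a-fA-F]{4})")


def decode_unicode_escapes(text: str) -> str:
    # Hypothetical helper: replace each escape with the character it encodes.
    # Assumption: only Basic Multilingual Plane characters occur, so
    # surrogate pairs are not recombined here.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# Usage: the value as returned in the question
decoded = decode_unicode_escapes(r"Fran\u00e7ois")
# decoded == "François"
```

This leaves the stored property untouched and only changes how the client renders it.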

3 REPLIES 3

Contributor

@dvt isoft

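Custom SerDes are always a last resort. Hive expects UTF-8 data. If the encoding is, say, ISO/IEC 8859-1, you will need to either convert the data to UTF-8 or, starting with Hive 0.14, use the feature added in https://issues.apache.org/jira/browse/HIVE-7142 and declare the encoding through the "serialization.encoding" SerDe property. The value is a character set name rather than a language code, so for French data stored as ISO/IEC 8859-1 it would be 'ISO-8859-1'. Below is an example for GBK: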

CREATE TABLE person(id INT, name STRING, desc STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES("serialization.encoding"='GBK');
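
For the other route mentioned above, converting the data itself, the files can be re-encoded to UTF-8 before they are loaded into Hive. Here is a minimal Python sketch, assuming the source files really are ISO/IEC 8859-1; the function names and file paths are hypothetical.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    # Interpret the bytes as ISO-8859-1, then re-encode them as UTF-8.
    return raw.decode("iso-8859-1").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    # Stream line by line so large files are not held in memory.
    # newline="" keeps the original line endings unchanged.
    with open(src_path, "r", encoding="iso-8859-1", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Usage: 'ç' is the single byte 0xE7 in ISO-8859-1 and two bytes in UTF-8
converted = latin1_to_utf8(b"Fran\xe7ois")
# converted == b"Fran\xc3\xa7ois"
```

Note that ISO-8859-1 decoding never fails, because every byte maps to a character, so a file in a different encoding would be converted silently and incorrectly. It is worth checking a sample of the output.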

Contributor

Thank you, it seems that TBLPROPERTIES does not like French characters...