Impala JDBC bug: the length of a Chinese character string is not correct

We use Impala 2.9 and Kudu 1.4. When we insert a Chinese character string such as "汉字字符测试" through impalajdbc41, only one third of the characters are stored. We found that an insert statement such as "insert into table1(name) values('汉字字符测试')" becomes "insert into table1(name) values(cast('汉字字符测试' as char(6)))". The 6 is the number of characters in '汉字字符测试', which is not the correct length for a Chinese character string: the length should depend on the encoding. In UTF-8 each of these characters takes 3 bytes, so the string is 18 bytes long, which would explain why only the first two of the six characters are stored. We should use "汉字字符测试".getBytes("UTF-8").length to get the length of "汉字字符测试" when the encoding is UTF-8, but it seems that impalajdbc41 uses "汉字字符测试".length() instead.
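The mismatch described above can be reproduced in plain Java without the driver. This is only a minimal sketch: the class `Utf8LengthDemo` and its methods are hypothetical names made up for illustration, not part of impalajdbc41, and `truncateUtf8Bytes` merely assumes that a CHAR(6) column keeps the first 6 bytes of the UTF-8 value. The string is written with Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // Length as String.length() reports it: the number of UTF-16 code units.
    static int charLength(String s) {
        return s.length();
    }

    // Length of the string once encoded as UTF-8, in bytes.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Assumption for illustration: emulate a byte-limited column by keeping
    // only the first maxBytes bytes of the UTF-8 encoding. If the cut falls
    // inside a multi-byte character, the decoder substitutes U+FFFD.
    static String truncateUtf8Bytes(String s, int maxBytes) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(maxBytes, bytes.length);
        return new String(bytes, 0, n, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "汉字字符测试" written with Unicode escapes
        String s = "\u6c49\u5b57\u5b57\u7b26\u6d4b\u8bd5";
        System.out.println("length():      " + charLength(s));      // 6
        System.out.println("UTF-8 bytes:   " + utf8ByteLength(s));  // 18
        System.out.println("first 6 bytes: " + truncateUtf8Bytes(s, 6)); // 汉字
    }
}
```

Under that assumption, a length of 6 taken from length() keeps 6 of the 18 bytes, that is two of the six characters, which matches the one third we see stored.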