
Impala JDBC bug: the length of a Chinese character string is not correct

We use Impala 2.9 and Kudu 1.4. When we use impalajdbc41 to insert a Chinese character string such as "汉字字符测试", only one third of the characters are stored. We found that an insert statement such as "insert into table1(name) values('汉字字符测试')" becomes "insert into table1(name) values(cast('汉字字符测试' as char(6)))". The 6 is the number of characters in '汉字字符测试', which is not the correct length for a Chinese character string: the length should depend on the encoding. With UTF-8 as the encoding, "汉字字符测试".getBytes("UTF-8").length should be used to get the length, but it seems that impalajdbc41 uses "汉字字符测试".length() instead.
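
To make the mismatch concrete, here is a minimal sketch in plain Java that compares the two length calculations mentioned above. It does not touch the driver; the class name Utf8LengthDemo and its helper methods are made up for illustration, and the sample string is written with Unicode escapes only so the source compiles regardless of file encoding.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // The six-character sample string from the post, written with
    // Unicode escapes so the source file stays pure ASCII.
    static final String SAMPLE = "\u6C49\u5B57\u5B57\u7B26\u6D4B\u8BD5";

    // What String.length() reports: the number of UTF-16 code units.
    static int charLength(String s) {
        return s.length();
    }

    // The number of bytes the string occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("length():    " + charLength(SAMPLE));     // prints 6
        System.out.println("UTF-8 bytes: " + utf8ByteLength(SAMPLE)); // prints 18
    }
}
```

Each of these characters takes 3 bytes in UTF-8, so the string needs 18 bytes while length() returns 6. If the CHAR length is counted in bytes, as the truncation suggests, char(6) keeps only the first 6 bytes, which is 2 of the 6 characters and matches the one third we see stored.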