Created on 05-25-2016 03:21 AM - edited 09-16-2022 03:21 AM
Hello,
I am porting an app from SQLServer to Impala. Microsoft say that when they compare strings they follow the ANSI/ISO SQL-92 specification (Section 8.2, <Comparison Predicate>, General rules #3). (see https://support.microsoft.com/en-us/kb/316626)
As far as I can tell Impala does not do this.
This simple script illustrates this
CREATE TABLE test (C1 CHAR(10));
insert into test (C1) VALUES ('ABC '); --note the trailing spaces here
select * from test where c1 = 'ABC'; --note I am not specifying them here
drop table test;
When run on SQLServer, I get the row back. Impala does not return it.
Is there a way to fix this?
Created 05-25-2016 11:40 PM
You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.
It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.
To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.
So in your example, you'd have to change your query to:
select * from test where trim(c1) = 'ABC';
Created 05-25-2016 10:49 AM
Thanks for your report. This is a known issue:
https://issues.cloudera.org/browse/IMPALA-1652
The simplest workaround is to use the STRING data type instead.
Created 05-25-2016 11:27 PM
Hi,
I tried following your suggestion. This is my script
CREATE TABLE test (C1 STRING);
insert into test (C1) VALUES ('ABC '); --note the trailing spaces here
select * from test where c1 = 'ABC'; --note I am not specifying them here
drop table test;
This also doesnt work. What am I doing wrong?
Created 05-25-2016 11:40 PM
You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.
It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.
To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.
So in your example, you'd have to change your query to:
select * from test where trim(c1) = 'ABC';
Created 05-26-2016 12:02 AM
Ok, that makes sense now. This solution does also work with CHAR type guess. The other alternative is to pad out the literal, which in my particular case is easier.
Thanks for the prompy reply.
Created 05-26-2016 12:10 AM
You're welcome. Sorry for the inconvenience.