- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
Impala Query Filter with trailing spaces - not following ANSI standard
- Labels:
-
Apache Impala
Created on ‎05-25-2016 03:21 AM - edited ‎09-16-2022 03:21 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am porting an app from SQLServer to Impala. Microsoft say that when they compare strings they follow the ANSI/ISO SQL-92 specification (Section 8.2, <Comparison Predicate>, General rules #3). (see https://support.microsoft.com/en-us/kb/316626)
As far as I can tell Impala does not do this.
This simple script illustrates this
CREATE TABLE test (C1 CHAR(10));
insert into test (C1) VALUES ('ABC '); --note the trailing spaces here
select * from test where c1 = 'ABC'; --note I am not specifying them here
drop table test;
When run on SQLServer, I get the row back. Impala does not return it.
Is there a way to fix this?
Created ‎05-25-2016 11:40 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.
It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.
To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.
So in your example, you'd have to change your query to:
select * from test where trim(c1) = 'ABC';
Created ‎05-25-2016 10:49 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your report. This is a known issue:
https://issues.cloudera.org/browse/IMPALA-1652
The simplest workaround is to use the STRING data type instead.
Created ‎05-25-2016 11:27 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I tried following your suggestion. This is my script
CREATE TABLE test (C1 STRING);
insert into test (C1) VALUES ('ABC '); --note the trailing spaces here
select * from test where c1 = 'ABC'; --note I am not specifying them here
drop table test;
This also doesnt work. What am I doing wrong?
Created ‎05-25-2016 11:40 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.
It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.
To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.
So in your example, you'd have to change your query to:
select * from test where trim(c1) = 'ABC';
Created ‎05-26-2016 12:02 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Ok, that makes sense now. This solution does also work with CHAR type guess. The other alternative is to pad out the literal, which in my particular case is easier.
Thanks for the prompy reply.
Created ‎05-26-2016 12:10 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You're welcome. Sorry for the inconvenience.