Support Questions

Find answers, ask questions, and share your expertise

Impala Query Filter with trailing spaces - not following ANSI standard

avatar
Explorer

Hello,

 

I am porting an app from SQLServer to Impala. Microsoft say that when they compare strings they follow the ANSI/ISO SQL-92 specification (Section 8.2, <Comparison Predicate>, General rules #3). (see https://support.microsoft.com/en-us/kb/316626)

 

As far as I can tell Impala does not do this.

 

This simple script illustrates this

 

CREATE TABLE test (C1 CHAR(10));
insert into test (C1) VALUES ('ABC       '); --note the trailing spaces here
select * from test where c1 = 'ABC';          --note I am not specifying them here
drop table test;

 

When run on SQLServer, I get the row back. Impala does not return it.

 

Is there a way to fix this?

1 ACCEPTED SOLUTION

avatar

You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.

 

It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.

 

To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.

 

So in your example, you'd have to change your query to:

 

select * from test where trim(c1) = 'ABC';     

View solution in original post

5 REPLIES 5

avatar

Thanks for your report. This is a known issue:

https://issues.cloudera.org/browse/IMPALA-1652

 

The simplest workaround is to use the STRING data type instead.

avatar
Explorer

Hi,

 

I tried following your suggestion. This is my script

 

CREATE TABLE test (C1 STRING);
insert into test (C1) VALUES ('ABC       '); --note the trailing spaces here
select * from test where c1 = 'ABC';          --note I am not specifying them here
drop table test;

 

This also doesnt work. What am I doing wrong?

avatar

You are not doing anything wrong. I should have been more clear in my description of the workaround, sorry.

 

It seems like what you want is CHAR comparison semantics, but Impala currently does not provide that directly.

 

To avoid confusion, I'd recommend using the STRING type consistently. However, comparison of STRINGs are whitespace-sensitive, so you will have to work around that, e.g., by manually trimming the whitespace when doing comparisons.

 

So in your example, you'd have to change your query to:

 

select * from test where trim(c1) = 'ABC';     

avatar
Explorer

Ok, that makes sense now. This solution does also work with CHAR type guess. The other alternative is to pad out the literal, which in my particular case is easier.

 

Thanks for the prompy reply.

avatar

You're welcome. Sorry for the inconvenience.