Thank you for checking, @Tim Armstrong! We have also found that the issue does not occur when querying Impala through the Java API, even with older connectors. However, it does occur when you use another client. I checked the 2.6.4 connector, and the issue still persists when running the query through SQLWorkbench. Have you tried using a different JDBC client?
I've found that the regexp_extract and regexp_replace functions behave differently depending on comparisons done in the same query or the database used. Consider the following script:

create schema test;
create table test.a (text string);
insert into test.a values ("a");

The following query behaves inconsistently with the Impala documentation:

select regexp_extract(text, '\w', 0), regexp_extract(text, '\\w', 0), text from test.a;

According to the Impala documentation, double backslashes should be used as the regex escape character. However, that doesn't work here (see Col2 in the above result). Instead, the pattern only matches when a single backslash is used. If we add an unrelated comparison to the query, this behaviour changes:

select regexp_extract(text, '\w', 0), regexp_extract(text, '\\w', 0), text, text = "a" from test.a;

Now a double backslash is required for the regex to function correctly. The result is identical if one uses another table column in the comparison or puts the comparison into the where clause.

This strange behaviour is only present when running queries over JDBC/ODBC on a non-default database. Hue and Impala-Shell work as expected, and JDBC/ODBC queries work as expected when executed on tables in the default database. I've tested this on CDH 5.15.0 and 5.13.1 with JDBC-188.8.131.524 and ODBC v184.108.40.2064 (32bit) drivers. Is this a bug or am I missing something? Is anyone else experiencing the same issue?
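To make the two result patterns easier to compare, here is a minimal Python sketch of how the number of string-literal unescaping passes changes what the regex engine receives. It is only a model: `unescape_once` and `extract` are hypothetical helpers, the unescaping rule is a simplifying assumption rather than Impala's actual parser, and Python's `re` stands in for Impala's regex engine. It does not show where in the JDBC/ODBC path the difference arises.

```python
import re


def unescape_once(literal):
    # Hypothetical, simplified unescaping pass (an assumption, not Impala's
    # real parser): a backslash is dropped and the character after it is
    # kept as-is, so two backslashes collapse into one.
    out = []
    i = 0
    while i < len(literal):
        if literal[i] == "\\" and i + 1 < len(literal):
            out.append(literal[i + 1])
            i += 2
        else:
            out.append(literal[i])
            i += 1
    return "".join(out)


def extract(text, pattern):
    # Stand-in for regexp_extract(text, pattern, 0): the whole match,
    # or an empty string here when nothing matches.
    m = re.search(pattern, text)
    return m.group(0) if m else ""


single = r"\w"    # typed as '\w' in the query
double = r"\\w"   # typed as '\\w' in the query

# No unescaping pass: the pattern reaches the regex engine exactly as typed.
no_pass = (extract("a", single), extract("a", double))

# One unescaping pass before the regex engine sees the pattern.
one_pass = (extract("a", unescape_once(single)),
            extract("a", unescape_once(double)))

print(no_pass)   # ('a', '')  -> only the single backslash matches
print(one_pass)  # ('', 'a')  -> only the double backslash matches
```

Under these assumptions, the first query's result looks like the "no pass" case and the second query's result looks like the documented "one pass" case, which would be consistent with the literal being unescaped a different number of times depending on the query.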