Member since
04-17-2023
5
Posts
0
Kudos Received
0
Solutions
04-20-2023
07:16 AM
I now see I didn't see this reply as I was busy troubleshooting and replying about the advanced option check mark.. Sorry 😅 I was not aware that NVL functions were available in Impala, thanks for the proposal. In my case I could also reverse the CASE WHEN condition (to check for IS NULL instead of IS NOT NULL, which also generated the correct result. Nevertheless, although for me the impact is quite low, I'm not sure this is the case for all possible usages, and that's why I wanted to make the issue known. Thanks for picking it up, and hopefully it get's solved in a coming update! Thanks for your assistance, Jeremy
... View more
04-19-2023
08:14 AM
And another update from my side: In the advanced settings I can select "Use Native Query". With this option "on", the results are as expected. For now that will be my workaround, but I expect I am "loosing" on some query optimizations.. I'll gladly help if more info is needed!
... View more
04-19-2023
06:54 AM
Sorry to post two messages so close to one-another, but I am not allowed to open a case. @mszurap, can you tell me what you would advise? I have checked the log myself, and have found that the query is converted in the following way: (CASE WHEN col2 IS NOT NULL THEN col2 ELSE 0 END) AS col2_1 becomes: COALESCE(`jk_test`.`col2`) AS `col2_1` but it should have become: COALESCE(`jk_test`.`col2`, 0) AS `col2_1` I don't see any way to attach a file to this thread, but I will paste the important snippets here: line 2636:
Apr 19 15:28:46.485 INFO 15044 Simba::ODBC::StatementState::InternalPrepare: Preparing query: SELECT col1, col2 as col2_orig, (CASE WHEN col2 IS NOT NULL THEN col2 ELSE 0 END) AS col2_1, (CASE WHEN col2 IS NULL THEN 0 ELSE col2 END) AS col2_2 FROM sandbox.jk_test
line 2767:
Apr 19 15:28:47.016 TRACE 15044 Simba::ImpalaODBC::ImpalaSQLGenerator::GenerateAEValueExpr: Generated SQL: COALESCE(`jk_test`.`col2`) AS `col2_1`
line 2849:
Apr 19 15:28:47.016 TRACE 15044 Simba::ImpalaODBC::ImpalaSQLGenerator::GenerateOneStatement: Generated SQL: SELECT `jk_test`.`col1`, `jk_test`.`col2` AS `col2_orig` , COALESCE(`jk_test`.`col2`) AS `col2_1` , (CASE WHEN (`jk_test`.`col2` IS NULL) THEN 0 ELSE `jk_test`.`col2` END) AS `col2_2` FROM `sandbox`.`jk_test` `jk_test`
line 2858-2861:
Apr 19 15:28:47.016 DEBUG 15044 Simba::ImpalaODBC::ImpalaTCLIServiceThreadSafeClient::ExecuteStatement: TExecuteStatementReq
sessionHandle.sessionId.guid =
statement = SELECT `jk_test`.`col1`, `jk_test`.`col2` AS `col2_orig` , COALESCE(`jk_test`.`col2`) AS `col2_1` , (CASE WHEN (`jk_test`.`col2` IS NULL) THEN 0 ELSE `jk_test`.`col2` END) AS `col2_2` FROM `sandbox`.`jk_test` `jk_test`
runAsync = true Of course, I'm guessing other lines might be important for debugging, I am confident that there is an issue in the driver when generating the query, probably to optimize it... Because when using the CTE version, no query is generated whatsoever, and the raw one is executed: lines 3587-3590:
Apr 19 15:28:48.365 DEBUG 15044 Simba::ImpalaODBC::ImpalaTCLIServiceThreadSafeClient::ExecuteStatement: TExecuteStatementReq
sessionHandle.sessionId.guid =
statement = WITH wrapped_table AS (SELECT col1, col2 as col2_orig, (CASE WHEN col2 IS NOT NULL THEN col2 ELSE 0 END) AS col2_1, (CASE WHEN col2 IS NULL THEN 0 ELSE col2 END) AS col2_2 FROM gda_sandbox.jk_test) SELECT * FROM wrapped_table
runAsync = true Please let me know if I can do anything else, or if there is any other way for a non-paying user to submit bug-reports. Thanks!
... View more
04-19-2023
06:24 AM
@mszurap Thanks for your reply, I forgot to update the original post, but I have already tried updating to 2.6.17. I will open a support case, and paste the trace as requested.
... View more
04-17-2023
01:33 AM
I ran into an issue last week, which seems to be an issue related to the Cloudera ODBC Driver for Impala. First I thought it was an issue with pyodbc, but after submitting the issue on their github we figured out that it was a driver issue. For reference: pyodbc github issue 1197 Simply put, executing query1 and query2 should produce the same results, but it doesn't: query1 = """SELECT
col1, col2 as col2_orig,
(CASE WHEN col2 IS NOT NULL THEN col2
ELSE 0 END) AS col2_1,
(CASE WHEN col2 IS NULL THEN 0
ELSE col2 END) AS col2_2
FROM sandbox.jk_test"""
query2 = f"""WITH wrapped_table AS ({query1}) SELECT * FROM wrapped_table""" It results in two different sets: ['col1', 'col2_orig', 'col2_1', 'col2_2']
['1 ', '1 ', '1 ', '1 ']
['2 ', 'None ', 'None ', '0 ']
['col1', 'col2_orig', 'col2_1', 'col2_2']
['1 ', '1 ', '1 ', '1 ']
['2 ', 'None ', '0 ', '0 '] The 2nd result is the expected one. * See the github issue for the code used to run these queries. Running the queries with CData's driver solves the issue, as does connecting using a JDBC connection. My Environment: Python: Python 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)] OS: Windows 10 Enterprise 20H2 64-bit driver: Cloudera ODBC Driver for Impala 2.06.16.1022 Windows Script Host Version 5.812 (for vbs script)
... View more
Labels:
- Labels:
-
Apache Impala