Created 09-26-2024 06:25 AM
Hi folks,
I tried to submit a query with params via ODBC to Impala. Thi is my code:
import pyodbc
# connect
crsr = pyodbc.connect('DSN=impala', autocommit=True).cursor()
param1 = 1000000000
param2 = '2023-01-01'
# long query
query = """
select *
from target_test.test d
left outer join
(select cli, fsc, nom_rag_soc, dat_ini_val, dat_ccz_inl, dat_csz_efv, dta_cessata, sta, prg_var, cod_cau_csz_dl, rn
from
(select
c2.cod_cli as cli,
trim(c2.cod_fsc) as fsc,
trim(c2.nom_rag_soc) as nom_rag_soc,
c2.dat_ini_val,
dat_ccz_inl,
dat_csz_efv,
(case when c2.dat_csz_efv= '1900-01-01 00:00:00' then '2999-01-01 00:00:00' else c2.dat_csz_efv end) as dta_cessata,
sta,
prg_var,
cod_cau_csz_dl,
row_number() over (partition by cod_fsc order by (case when c2.dat_csz_efv='1900-01-01 00:00:00' then '2999-01-01 00:00:00' else c2.dat_csz_efv end) desc) as rn
from target_test.test2 c2
where
c2.sta='c'
and c2.dat_ini_val = (select max(c1.dat_ini_val) from target_test.test3 c1 where c1.sta='c' and c2.cod_cli=c1.cod_cli and c2.cod_fsc=c1.cod_fsc)
and prg_var = (select max(prg_var) from target_test.v_padln_t_cliente xc where c2.cod_cli = xc.cod_cli and sta ='c' and xc.dat_ini_val = c2.dat_ini_val)
) a
where rn=1
) k
on trim(d.cod_fsc)=k.fsc
where
d.prg_rig < ?
and d.dat_cre_flu >= ?
and d.tip_flu = 'f24';
"""
try:
res = crsr.execute(query, param1, param2).fetchall
print(res)
except pyodbc.Error as e:
e.args = (e.args[1],)
raise
But this is the output:
I installed this driver: ClouderaImpalaODBC-2.7.1.1009-1.x86_64.rpm
If I try whit simple query, without subqueries, it works.
Any suggestions?
Created 10-06-2024 12:35 PM
Hi-
If you are using the latest ODBC and seeing this issue, then this could be hitting an ODBC driver 'corner case' bug. Please open a ticket with Cloudera Support, and we will take it further to the Driver Team.
Created 09-26-2024 07:10 AM
@disoardi, Welcome to our community! To help you get the best possible answer, I have tagged our Impala experts @ChethanYM @SVB @mszurap who may be able to assist you further.
Please feel free to provide any additional information or details about your query, and we hope that you will find a satisfactory solution to your question.
Regards,
Vidya Sargur,Created on 09-30-2024 12:02 AM - edited 09-30-2024 06:09 AM
Hi @VidyaSargur and thanks for your suggestion.
@ChethanYM, @SVB or @mszurap any ideas about this issue?
Thanks in advance.
Created 09-30-2024 04:51 PM
@disoardi Parametric queries do only work if the "UseNativeQuery" option is set to 0, this is the default (but you might have it set to 1 in the DSN configuration).
Yo could try connecting with:
crsr = pyodbc.connect('DSN=impala;UseNativeQuery=0', autocommit=True).cursor()
See page 84 of the "Cloudera ODBC Connector for Apache Impala Installation and Configuration Guide" for the full description of this option.
Created 10-01-2024 01:49 AM
I tried but the problem remains the same.
Thanks for the tip
Created 10-01-2024 03:29 AM
The error message shows that Impala gets the query with question marks in it, which is not good, as Impala itself doesn't supports prepared statements or query parameters. All of this functionality should be done in the ODBC driver.
You've written that a simple query without subquery works. Does the simple query works with or without parameter substitution?
Since the whole prepared statement/query substitution is don by the ODBC driver, and not by Impala, you would get no performance gains from using it. So I believe this only useful if you are porting some existing code/queries to use Impala.
You can just use a python f-string or the .format() function to do the parameter substitution by yourself in your code, it won't hurt performance.
Created 10-02-2024 12:33 AM
The simple query, without subqueries, with params works.
For example:
SELECT * FROM target_test.test2 where id = ?;
About the support of params in the ODBC driver for impala I found this: https://community.cloudera.com/t5/Support-Questions/Unable-to-pass-parameter-to-Impala-from-SSRS/m-p...
Quickly the ODBC driver for impala supports the standard SQLBindParameter() API to subsitute ? in the query text with parameters.
but this seems not to work for complex queries.
Created 10-02-2024 02:43 AM
I would try if replacing the sub-queries with 'WITH' statements would help. Maybe the query is just too complex for this query-rewrite/parameter substitution engine n the ODBC driver.
If that not helps, there are some logging options for the driver, I would use those to see if they give any useful information what is happening inside the driver.
Created on 10-03-2024 01:35 AM - edited 10-03-2024 01:43 AM
I tried but not works. The complet test below:
import pyodbc
# connect
crsr = pyodbc.connect('DSN=impala1;', autocommit=True).cursor()
# unnamed parameters
params = ('C', 1000000000, '2023-01-01')
# short query
query = """SELECT *
FROM test1.test1 d
INNER JOIN
(SELECT *
FROM test1.test2 c1
WHERE dat_ini_val = (SELECT max(c2.dat_ini_val)
FROM test1.test2 c2
WHERE c2.sta = ?
AND c2.cod_cli=c1.cod_cli
AND c1.cod_fsc=c2.cod_fsc)) c
ON d.cod_cli = c.cod_cli
WHERE cast(D.cod_fsc as string) <> ''
AND d.prg_rig < ? AND d.dat_cre_flu >= ? AND d.TIP_FLU = 'F24';"""
try:
res = crsr.execute(query, params).fetchall() #<-- this works
print('Query simple DONE!!!')
except pyodbc.Error as e:
e.args = (e.args[1],)
print(e)
# long query
query = """SELECT *
FROM test1.test1 d
LEFT OUTER JOIN
(SELECT cli, fsc, nom_rag_soc, dat_ini_val, dat_ccz_inl, Dat_csz_efv, sta, prg_var, COD_CAU_CSZ_DL
FROM
(SELECT c2.cod_cli AS cli,
TRIM(CAST(c2.cod_fsc AS STRING)) AS fsc,
TRIM(CAST(c2.nom_rag_soc AS STRING)) AS nom_rag_soc,
c2.dat_ini_val,
c2.dat_ccz_inl,
c2.Dat_csz_efv,
c2.sta,
c2.prg_var,
c2.COD_CAU_CSZ_DL
,(CASE WHEN c2.Dat_csz_efv = '1900-01-01 00:00:00' THEN '2999-01-01' ELSE c2.Dat_csz_efv END) AS dta_cessata
,ROW_NUMBER () OVER (PARTITION BY cod_fsc ORDER BY (case when c2.Dat_csz_efv='1900-01-01 00:00:00' then '2999-01-01' ELSE c2.Dat_csz_efv END) desc) rn
FROM test1.test2 c2
WHERE c2.sta = ?
) A
) k
ON TRIM(CAST(d.COD_fsc AS STRING)) = k.fsc
WHERE d.prg_rig < ?
AND d.dat_cre_flu >= ?
AND d.TIP_FLU = 'F24';"""
try:
res = crsr.execute(query, params).fetchall() #<-- this does not work
print('Query complex DONE!!!')
except pyodbc.Error as e:
e.args = (e.args[1],)
print(e)
query = """
WITH prova as (SELECT c2.cod_cli AS cli,
TRIM(CAST(c2.cod_fsc AS STRING)) AS fsc,
TRIM(CAST(c2.nom_rag_soc AS STRING)) AS nom_rag_soc,
c2.dat_ini_val,
c2.dat_ccz_inl,
c2.Dat_csz_efv,
c2.sta,
c2.prg_var,
c2.COD_CAU_CSZ_DL
,(CASE WHEN c2.Dat_csz_efv = '1900-01-01 00:00:00' THEN '2999-01-01' ELSE c2.Dat_csz_efv END) AS dta_cessata
,ROW_NUMBER () OVER (PARTITION BY cod_fsc ORDER BY (case when c2.Dat_csz_efv='1900-01-01 00:00:00' then '2999-01-01' ELSE c2.Dat_csz_efv END) desc) rn
FROM test1.test2 c2
WHERE c2.sta = 'C'
)
SELECT *
FROM test1.test1 d
LEFT OUTER JOIN
(SELECT cli, fsc, nom_rag_soc, dat_ini_val, dat_ccz_inl, Dat_csz_efv, sta, prg_var, COD_CAU_CSZ_DL
FROM prova
) k
ON TRIM(CAST(d.COD_fsc AS STRING)) = k.fsc
WHERE d.prg_rig < ?
AND d.dat_cre_flu >= ?
AND d.TIP_FLU = 'F24';
"""
params = (1000000000, '2023-01-01')
try:
res = crsr.execute(query, params).fetchall() #<-- this does not work
print('Query with DONE!!!')
except pyodbc.Error as e:
e.args = (e.args[1],)
print(e)
the simple query, with parameters, works, others do not
Created 10-06-2024 12:35 PM
Hi-
If you are using the latest ODBC and seeing this issue, then this could be hitting an ODBC driver 'corner case' bug. Please open a ticket with Cloudera Support, and we will take it further to the Driver Team.