Member since
12-10-2015
27
Posts
7
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1874 | 10-15-2024 09:03 AM |
10-15-2024
09:03 AM
1 Kudo
Hi @AKO , Impala has variable substitution like this:

    [hostname.local.net:21000] default> SET VAR:query=SELECT 1+2;
    Variable QUERY set to SELECT 1+2
    [hostname.local.net:21000] default> ${VAR:query};
    Query: SELECT 1+2
    Query submitted at: 2024-10-15 15:54:29 (Coordinator: https://hostname.local.net:25000)
    Query progress can be monitored at: https://hostname.local.net:25000/query_plan?query_id=nnnn
    +-------+
    | 1 + 2 |
    +-------+
    | 3     |
    +-------+
    Fetched 1 row(s) in 1.15s

See the official Impala docs at: https://impala.apache.org/docs/build/html/topics/impala_shell_running_commands.html

This is a feature of impala-shell, not of Impala itself, so depending on what you call "Impala Query Manager", your experience might be different.

If you want a solution that is more database independent, I recommend using a view or a CTE (WITH clause) instead:

    WITH sub_query AS ( SELECT 1+2 ) SELECT * FROM sub_query;
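To illustrate the portability point, here is a minimal sketch that runs the same WITH query against an in-memory SQLite database from Python. SQLite is only a stand-in chosen so the example is runnable anywhere; it is not Impala, and the column alias `total` is added purely for illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in engine for this sketch.
conn = sqlite3.connect(":memory:")

# The CTE plays the role of the "stored" sub-query; no shell-specific
# variable substitution is involved.
cte_query = """
WITH sub_query AS (
    SELECT 1 + 2 AS total
)
SELECT * FROM sub_query
"""

rows = conn.execute(cte_query).fetchall()
print(rows)
conn.close()
```

Because the CTE is plain SQL, the query text itself does not depend on impala-shell being the client.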
10-15-2024
08:33 AM
1 Kudo
Hi @mrblack , how do you know that Impala performs a full table scan?
10-15-2024
07:32 AM
In your WHERE clause:

    r.key='street' AND r.value='abc' AND r.key='phone' AND r.value='123'

you are using the AND operator between all the conditions. That would select only a row/record where all of these conditions are true at the same time, but no such record can exist, because r.key cannot be both 'street' and 'phone' in the same row. I think that's why you are getting empty results. You should use OR between conditions that apply to different rows, like:

    (r.key='street' AND r.value='abc') OR (r.key='phone' AND r.value='123')
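The difference can be reproduced with a small sketch. It uses SQLite from Python rather than Impala, and the table `attrs` with columns `id`, `attr_key` and `attr_value` is a hypothetical stand-in for the flattened key/value rows in the question. The last query is an optional follow-up, not part of the advice above: if you need only the ids that match both pairs, one common approach is to group the OR result and count the distinct keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical key/value layout: one row per (id, key, value).
conn.execute("CREATE TABLE attrs (id INTEGER, attr_key TEXT, attr_value TEXT)")
conn.executemany(
    "INSERT INTO attrs VALUES (?, ?, ?)",
    [
        (1, "street", "abc"),
        (1, "phone", "123"),
        (2, "street", "abc"),
        (2, "phone", "999"),
    ],
)

# AND across everything: a single row can never have two different keys.
and_rows = conn.execute(
    "SELECT id FROM attrs "
    "WHERE attr_key = 'street' AND attr_value = 'abc' "
    "AND attr_key = 'phone' AND attr_value = '123'"
).fetchall()

# OR between the per-row conditions: each matching row is returned.
or_rows = conn.execute(
    "SELECT id, attr_key FROM attrs "
    "WHERE (attr_key = 'street' AND attr_value = 'abc') "
    "OR (attr_key = 'phone' AND attr_value = '123') "
    "ORDER BY id, attr_key"
).fetchall()

# Optional: keep only ids for which both key/value pairs matched.
both_ids = conn.execute(
    "SELECT id FROM attrs "
    "WHERE (attr_key = 'street' AND attr_value = 'abc') "
    "OR (attr_key = 'phone' AND attr_value = '123') "
    "GROUP BY id HAVING COUNT(DISTINCT attr_key) = 2"
).fetchall()

print(and_rows, or_rows, both_ids)
conn.close()
```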
10-15-2024
06:19 AM
@Kjarzyna wrote: Yes I saw the documentation, but i didn’t find solution there. In documentation you usually add just one map field and value into where clause

Hi @Kjarzyna , If you just add one single map key or value to the where clause, does your query work?
10-02-2024
02:43 AM
I would check whether replacing the sub-queries with WITH clauses helps. Maybe the query is just too complex for the query-rewrite/parameter substitution engine in the ODBC driver. If that doesn't help, the driver has some logging options; I would enable those to see if they give any useful information about what is happening inside the driver.
10-01-2024
01:25 PM
1 Kudo
Hi @evanle96 !

"Could this issue be related to HDFS failover and the HA configuration being affected by the deleted directories?"

No, I don't think so. Deleting directories should not affect NN failover or the HA configuration, unless there is something fundamentally wrong with your setup or hardware. Could you elaborate a bit more on what happened here?

"How can I validate whether the problem is related to HDFS HA and failover?"

What you mention in your last question, triggering a manual failover and checking whether basic reads and writes from the CLI work, should be a good start.

"Is there a way to force Sqoop/Oozie to properly use the active NameNode instead of the standby?"

HDFS clients in general should have a list of all NameNodes available to them. If the client gets the above error when connecting, it should try to connect to the next available NN. If that's not happening, there is likely some issue with the client's configuration (core-site.xml, hdfs-site.xml). It is possible that it only knows about one NN (which is the standby), or the config is outdated and pointing to an old, decommissioned host, or it cannot connect due to network issues. Your logs should tell whether the job is actually trying to fail over to the other NN, so a bit more context around the error message (more logs) would be useful to see what exactly is going on.

"I have checked the HA configuration, and failover seems to be functioning as the standby takes over when the active NameNode is restarted. However, the error persists when trying to read or write to HDFS."

Do you mean the Sqoop job fails, or that you cannot read/write with simple HDFS CLI commands, no matter which NN is active?
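One way to check which NameNodes a client configuration actually knows about is to read them out of hdfs-site.xml. The sketch below does this in Python for the standard HA properties (`dfs.nameservices`, `dfs.ha.namenodes.<nameservice>`, `dfs.namenode.rpc-address.<nameservice>.<namenode>`). The sample XML, the nameservice name `mycluster` and the host names are made up for illustration; in practice you would load the hdfs-site.xml that the Sqoop/Oozie job really uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; replace with the contents of the client's hdfs-site.xml.
SAMPLE_HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1.example.com:8020</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2.example.com:8020</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return the <property> entries of a Hadoop config file as a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def namenode_addresses(props):
    """Map each nameservice to its NameNode ids and RPC addresses."""
    result = {}
    for ns in props.get("dfs.nameservices", "").split(","):
        ns = ns.strip()
        if not ns:
            continue
        nn_ids = [n.strip() for n in props.get(f"dfs.ha.namenodes.{ns}", "").split(",") if n.strip()]
        # A missing rpc-address shows up as None, which is worth investigating.
        result[ns] = {nn: props.get(f"dfs.namenode.rpc-address.{ns}.{nn}") for nn in nn_ids}
    return result


addresses = namenode_addresses(load_properties(SAMPLE_HDFS_SITE))
for ns, nns in addresses.items():
    if len(nns) < 2:
        print(f"{ns}: only {len(nns)} NameNode(s) configured, the client cannot fail over")
    for nn, addr in nns.items():
        print(f"{ns}/{nn} -> {addr}")
```

If this lists only one NameNode, or a host that no longer exists, that would match the stale-configuration scenario described above.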
10-01-2024
03:29 AM
1 Kudo
The error message shows that Impala receives the query with the question marks still in it, which is not good, because Impala itself doesn't support prepared statements or query parameters. All of this functionality has to be done in the ODBC driver. You've written that a simple query without a subquery works. Does the simple query work with or without parameter substitution? Since the whole prepared statement/query substitution is done by the ODBC driver, and not by Impala, you would get no performance gains from using it. So I believe this is only useful if you are porting some existing code/queries to use Impala. You can just use a Python f-string or the .format() function to do the parameter substitution yourself in your code; it won't hurt performance.
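A minimal sketch of doing the substitution in your own code with .format() could look like this. The helper names `sql_literal` and `render`, and the table and column names, are hypothetical. The escaping rule (backslash before a single quote) is an assumption about Impala's string literal syntax that you should verify against the Impala documentation, especially if any of the values come from untrusted input.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal.

    Assumption: strings are single-quoted and embedded quotes and
    backslashes are escaped with a backslash.
    """
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check bool before int, since bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def render(template, **params):
    """Fill named {placeholders} in the template with SQL literals."""
    return template.format(**{name: sql_literal(v) for name, v in params.items()})


# Hypothetical table and columns, for illustration only.
query = render(
    "SELECT * FROM customers WHERE city = {city} AND age > {min_age}",
    city="O'Hare",
    min_age=30,
)
print(query)
```

The rendered string can then be passed to the cursor as a plain query, so the driver never has to rewrite question marks.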
09-30-2024
04:51 PM
1 Kudo
@disoardi Parameterized queries only work if the "UseNativeQuery" option is set to 0. This is the default, but you might have it set to 1 in the DSN configuration. You could try connecting with:

    crsr = pyodbc.connect('DSN=impala;UseNativeQuery=0', autocommit=True).cursor()

See page 84 of the "Cloudera ODBC Connector for Apache Impala Installation and Configuration Guide" for the full description of this option.
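As a rough sketch of how this could be wrapped, the helpers below build the connection string and run a query with `?` markers through pyodbc. The function names and the table `my_table` are hypothetical, the DSN name `impala` is taken from the line above, and the last call is left commented out because it needs a configured DSN and a reachable Impala service.

```python
def build_connection_string(dsn, **options):
    """Join a DSN name and driver options into an ODBC connection string."""
    parts = [f"DSN={dsn}"] + [f"{key}={value}" for key, value in options.items()]
    return ";".join(parts)


def run_parameterized(conn_str, sql, *params):
    """Execute a query with ? markers and return all rows."""
    import pyodbc  # imported here so the string helper works without the driver installed

    conn = pyodbc.connect(conn_str, autocommit=True)
    try:
        cursor = conn.cursor()
        cursor.execute(sql, *params)
        return cursor.fetchall()
    finally:
        conn.close()


# Force UseNativeQuery=0 so the driver performs the parameter substitution.
conn_str = build_connection_string("impala", UseNativeQuery=0)
print(conn_str)

# Requires a working DSN; my_table is a placeholder name.
# rows = run_parameterized(conn_str, "SELECT * FROM my_table WHERE id = ?", 42)
```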