Member since 07-29-2015 | 535 Posts | 140 Kudos Received | 103 Solutions
My Accepted Solutions
Views | Posted |
---|---|
6073 | 12-18-2020 01:46 PM |
3943 | 12-16-2020 12:11 PM |
2795 | 12-07-2020 01:47 PM |
1992 | 12-07-2020 09:21 AM |
1279 | 10-14-2020 11:15 AM |
06-18-2019 03:30 PM
Thanks a lot, Tim!
06-14-2019 11:49 PM
Thanks for your quick reply.
06-14-2019 02:06 AM
How can I do the same in one way for all the columns? I mean, instead of providing numeric values, what if I give the db.column_name?
06-13-2019 11:49 AM
1 Kudo
So IDLE_SESSION_TIMEOUT doesn't actually do anything if you set it in impala-shell. There's a technical distinction here: "set" in impala-shell is implemented in the shell itself, and the options are then sent along with the queries. If you instead run "set" as a statement through, say, the JDBC driver, then "set" is run as a server-side command that modifies the session state.

I tried to reproduce the issue on my local Impala environment, but couldn't. I was watching the /sessions page in the debug pages and I could see the "last accessed" time get continually incremented while the query was in the queue.

I think any timeout issue probably got indirectly fixed by https://issues.apache.org/jira/browse/IMPALA-5216 in CDH 6.1/Impala 3.1, since the query state would be polled directly by impala-shell while the query is in the queue, rather than having impala-shell blocked waiting for a response.

Thanks for submitting the support request, that should help get to the bottom of this.
06-13-2019 10:42 AM
Yeah, I agree there is some inconsistency in behaviour here. The casting rules, especially around NULL, are too complex and inconsistent.
05-06-2019 05:12 PM
The output of "explain <query>" is often helpful too.
04-24-2019 08:51 AM
Could you advise whether there is a solution to the problem of Impala assigning heavy query parts to busy executors?

For example, the following was observed on CDH 5.16 with Impala 2.12.0. Impala has several (let's say 5) executors, each having ~100GB RAM. Impala admission control is used. The mem_limit is set to the default (or about the default, ~80%), e.g. 80GB. The first relatively long and heavy query (let's call it Query1) arrives, and one of its steps takes ~70GB RAM on executor1, i.e. there is ~10GB of RAM left available for reservation on this executor. The other 4 executors are nearly idle. At the same time a second query (let's call it Query2) arrives, which requires 40GB RAM, and it might happen that Query2 is assigned to executor1, which is busy. Query2 then fails because it cannot allocate/reserve the memory.

Is there a way to configure Impala to assign fragments/query parts to less busy executors? So far, reducing concurrency or removing the reservation (since the reserved memory amount is usually larger than what is really used) might work, but it seems too inefficient to use only 1-2 executors out of 5. Impala on YARN might potentially help, but as far as I can see it requires Llama, which is deprecated and is going to be removed soon.
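To make the numbers in this scenario concrete, here is a minimal Python sketch of the memory arithmetic only. It is a toy model, not Impala's scheduler or admission controller; the function name `free_memory_gb` and the assumption that the other four executors hold exactly 0GB are illustrative and not from the post.

```python
# Toy model of the scenario above: it only does the memory arithmetic.
# It is NOT how Impala schedules fragments or admits queries.

def free_memory_gb(mem_limit_gb, reserved_gb):
    """Memory still available for reservation on one executor."""
    return mem_limit_gb - reserved_gb

MEM_LIMIT_GB = 80      # per-executor mem_limit from the post
QUERY2_NEEDS_GB = 40   # memory Query2 requires on an executor

# Reserved memory per executor: Query1 holds ~70GB on executor1.
# The other four are assumed fully idle (the post says "nearly idle").
reserved = {
    "executor1": 70,
    "executor2": 0,
    "executor3": 0,
    "executor4": 0,
    "executor5": 0,
}

# Executors that still have enough headroom for Query2.
can_host = []
for name, used in reserved.items():
    free = free_memory_gb(MEM_LIMIT_GB, used)
    fits = free >= QUERY2_NEEDS_GB
    if fits:
        can_host.append(name)
    print(f"{name}: {free}GB free, Query2 fits: {fits}")
```

Under these assumptions only executor1 lacks the headroom, which is why placing Query2 there fails even though the cluster as a whole has plenty of free memory.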
04-19-2019 02:30 PM
1 Kudo
https://www.cloudera.com/documentation/enterprise/latest/topics/impala_explain_plan.html#explain_plan is our high-level doc. I would recommend starting with the summary to understand where time is spent, then using the profile to drill down into individual nodes. WorkloadXM can help a lot to automate the analysis process and understand bottlenecks.
04-18-2019 10:09 AM
2 Kudos
If you are mainly accessing the table using Impala, I'd recommend Impala's COMPUTE STATS for the best Impala performance. There are some subtle differences in the stats collected (whether they're partition-level or table-level). The engines can interoperate, but Impala can generally generate better plans with the full set of stats from "COMPUTE STATS".
04-17-2019 10:28 PM
Thank you very much, Tim. The link you provided has clarified my doubt.