Created 11-27-2019 01:47 AM
Hello,
I am facing an issue with Impala queries: although the query status says 'finished' on the Impala status page, the query still shows as 'executing'. I was following this Cloudera article: https://my.cloudera.com/knowledge/Finished-Queries-show-as-Executing-in-the-Cloudera-Manager?id=7157...
The difference in my case is that the queries were run through impala shell via a JDBC connection.
Would the resolution outlined in the article be applicable to queries submitted through an Impala JDBC connection? And secondly, what would be the fix for queries that are in a hung state?
Regards
Amn
Created 11-27-2019 11:18 AM
I assume you're seeing something like what I attached here. I ran that query from impala-shell and it's going to sit in the FINISHED state for several minutes while the shell fetches all of the ~6 million rows.
The reason is that it just takes a bunch of time for the client to fetch the results, even if it's a client like impala-shell or JDBC that actively fetches the results. You'll see this for queries with large result sets, particularly if the connection from client to server is slow or has higher latency. I'd expect that the client will eventually get to the end of the result set and close the query on its own.
Hue is a bit different because it only fetches results as needed, so it can hold queries open even if they have small result sets.
There will be some significant perf and resource management improvements for use cases like this in the versions of Impala that come with CDP - e.g. https://issues.apache.org/jira/browse/IMPALA-8656 helps with this in various ways.
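To make the fetch-then-close behaviour described above concrete, here is a minimal sketch of a client loop that drains a result set in batches and then closes the cursor. It assumes only a generic Python DB-API 2.0 cursor (the kind a driver such as impyla exposes); the helper name `drain_and_close` and the batch size are hypothetical, and the in-memory `sqlite3` database is just a stand-in so the example runs without a cluster. With an Impala driver, the assumption is that the query stays open on the server while the loop is still fetching and is released once the cursor is closed.

```python
import sqlite3


def drain_and_close(cursor, batch_size=1000):
    """Fetch all remaining rows in batches, then close the cursor.

    Returns the number of rows fetched. The cursor is closed even if
    fetching fails part-way, so the query is not left open by the client.
    """
    total = 0
    try:
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:  # an empty batch means the result set is exhausted
                break
            total += len(rows)
    finally:
        cursor.close()
    return total


# Stand-in demo: sqlite3 replaces the real Impala connection here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(2500)])

cur = conn.cursor()
cur.execute("SELECT x FROM t")
fetched = drain_and_close(cur, batch_size=1000)
print(fetched)
```

A client that stops fetching part-way through and never closes the cursor would, under the same assumption, leave the query sitting open, which is the situation to look for if a query never leaves that state.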
Created 11-28-2019 09:37 PM
Hi Tim,
Thanks for your reply. We ran a couple of other queries and didn't encounter this issue, so we will resolve this as a one-off case. I appreciate your explanation in this regard.