Impala query troubles

Hello,

I am facing an issue with Impala queries: although the query status says 'finished' on the Impala status page, it still shows as 'executing'. I was following this Cloudera article: https://my.cloudera.com/knowledge/Finished-Queries-show-as-Executing-in-the-Cloudera-Manager?id=7157...

The difference in my case is that the queries were run through impala shell via a JDBC connection.

Would the resolution outlined in the article also apply to queries submitted through an Impala JDBC connection? And what would be the fix for queries that are stuck in a hung state?

 

Regards

Amn

1 ACCEPTED SOLUTION

I assume you're seeing something like what I attached here. I ran that query from impala-shell and it's going to sit in the FINISHED state for several minutes while the shell fetches all of the ~6 million rows.

 

The reason is that it just takes a bunch of time for the client to fetch the results, even if it's a client like impala-shell or JDBC that actively fetches the results. You'll see this for queries with large result sets, particularly if the connection from client to server is slow or has higher latency. I'd expect that the client will eventually get to the end of the result set and close the query on its own.

 

Hue is a bit different because it only fetches results as needed, so it can hold queries open even if they have small result sets.
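
To make the difference concrete, here is a toy Python model of the lifecycle described above. It is only a sketch and not Impala's API: `ToyQuery`, `fetch_batch`, `eager_client`, `lazy_client`, and the batch sizes are all made up for illustration. It shows a query that has finished executing but stays open until the client has fetched everything and closed it.

```python
class ToyQuery:
    """Toy stand-in for a server-side query handle; not a real Impala class."""

    def __init__(self, total_rows):
        self.remaining = total_rows
        self.state = "FINISHED"   # all rows are produced and ready to fetch
        self.open = True          # still listed as in flight until closed

    def fetch_batch(self, size):
        """Hand back up to `size` rows; returns the number of rows fetched."""
        n = min(size, self.remaining)
        self.remaining -= n
        return n

    def close(self):
        self.open = False


def eager_client(query, batch_size):
    """Like impala-shell or JDBC: fetch until exhausted, then close."""
    round_trips = 0
    while query.fetch_batch(batch_size) > 0:
        round_trips += 1
    query.close()
    return round_trips


def lazy_client(query, batch_size):
    """Like Hue: fetch only the first page and leave the query open."""
    query.fetch_batch(batch_size)
    return 1


# Hypothetical sizes: a large result set fetched in small batches.
big = ToyQuery(total_rows=6_000_000)
big_trips = eager_client(big, batch_size=1024)

# A tiny result set that a lazy client reads once and never closes.
small = ToyQuery(total_rows=10)
lazy_client(small, batch_size=100)

print(big_trips, big.open)          # many round trips, then closed
print(small.remaining, small.open)  # nothing left to fetch, yet still open
```

In this model the large query only closes after thousands of fetch round trips, each of which costs real time on a slow or high-latency connection, while the small query stays open simply because nothing closed it.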

 

There will be some significant perf and resource management improvements for use cases like this in the versions of Impala that come with CDP - e.g. https://issues.apache.org/jira/browse/IMPALA-8656 helps with this in various ways.


finished-debug-ui.png

REPLIES

@Amn_468,

Can you please explain a bit more on "run through impala shell via jdbc connection"? Which JDBC driver and how did you set it up?

The article is generic, as the setting is applied at the Impala server level: sessions time out on the Impala daemon, and when that happens the clients' connections are closed, together with the queries in those sessions.
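
As a rough illustration of why the client type does not matter here, the sketch below models a server-side idle check. It is a toy model and not Impala code: `ToySession`, `expire_idle_sessions`, the session names, and the 60-second timeout are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Toy stand-in for a client session on the daemon; not a real Impala type."""
    last_activity: float                       # seconds, on an arbitrary clock
    open_queries: set = field(default_factory=set)
    closed: bool = False


def expire_idle_sessions(sessions, now, idle_timeout_s):
    """Close every session idle longer than the timeout, and its queries with it."""
    expired = []
    for name, session in sessions.items():
        if not session.closed and now - session.last_activity > idle_timeout_s:
            session.open_queries.clear()   # queries go away with the session
            session.closed = True
            expired.append(name)
    return expired


# The check runs on the server, so it never looks at which client opened the session.
sessions = {
    "jdbc-client": ToySession(last_activity=0.0, open_queries={"q1", "q2"}),
    "shell-client": ToySession(last_activity=90.0, open_queries={"q3"}),
}

expired = expire_idle_sessions(sessions, now=100.0, idle_timeout_s=60.0)
print(expired)
```

Only the session that has been idle for longer than the assumed timeout is closed; the recently active one keeps its query.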

Cheers
Eric

Hi Tim,

Thanks for your reply. We ran a couple of other queries and didn't encounter this issue, so we will treat this as a one-off case. I appreciate your explanation.