
Impala close JDBC session


Explorer

Hi guys,
tl;dr: is it possible to close an Impala session opened from a Java client through the JDBC driver?

 

Long version: I am migrating a Java application from querying a traditional DBMS to querying Impala through JDBC. The application must stay alive for days, sequentially executing hundreds of thousands of very fast queries (about a second each). Moreover, it is not a single application: several instances run at once and share a common list of queries to execute. For each query, each application creates a new connection and closes it when the query finishes (a connection pool is not implemented, and not implementable, at the moment).
What I am experiencing is that the session is not really closed (Impalad still reports the session as active). Within a few minutes I fill all the available slots for incoming front-end connections (64 per Impalad in my cluster, the default of the "fe_service_threads" variable) and I can't open any more connections or sessions. The only way to kill the old sessions is to kill the Java processes, which is not a real solution.
I found that when the impalad startup variable "idle_session_timeout" is set, sessions expire after the specified number of seconds, but they remain marked as not closed and Impalad/CM still reports them.
I saw that it is possible to kill HUE=>Impala sessions with the "close_sessions" command (http://gethue.com/hadoop-tutorial-hive-and-impala-queries-life-cycle/). I need a similar solution for my JDBC sessions.
Each connection is opened with code very similar to the example reported here: http://www.cloudera.com/content/cloudera/en/documentation/connectors/latest/PDF/Cloudera-JDBC-Driver... .
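
For readers who want to reproduce this, the one-connection-per-query pattern described above can be sketched roughly as follows. This is a minimal sketch, not the application's actual code: the `PerQueryConnection` class and its `ConnectionFactory` interface are hypothetical names introduced here for illustration. try-with-resources closes the ResultSet, the Statement, and the Connection (in that order) even when the query throws, so the client side does call `Connection.close()` for every query. Whether the driver then also closes the Impala session is exactly what this thread is about.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class PerQueryConnection {

    /** Hypothetical hook (not part of any driver) that opens a fresh connection per query. */
    @FunctionalInterface
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    /**
     * Runs one query on its own connection and returns the number of rows read.
     * Resources are closed in reverse order of creation: ResultSet, Statement, Connection.
     */
    static long countRows(ConnectionFactory factory, String sql) throws SQLException {
        try (Connection conn = factory.open();
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            long rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        }
    }
}
```

A caller would pass something like `() -> DriverManager.getConnection(url)`, where `url` is whatever JDBC URL the installed driver expects; it is left out here because it depends on the driver and the authentication setup.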

 


Thanks for any suggestion,
Michele


Re: Impala close JDBC session

Master Collaborator

Hi Michele,

 

thanks for your report. Could be a bug in the JDBC driver. I'd expect that when the JDBC Connection is closed, the session is also terminated.

 

I've filed https://issues.cloudera.org/browse/IMPALA-1905 to track progress on this issue.

 

You might give Hive's JDBC driver a try instead. If you go down this path, would you mind reporting whether it resolved your issue?

 

Thanks!

 

Alex

Re: Impala close JDBC session

Explorer

Hi Alex,

thanks for answering. You have also correctly matched me and Alessio: we work together and are trying to solve the same problem.

 

We have already tried the Hive JDBC driver, with the same result. I am now running some other experiments to find a workaround while waiting for a real patch, something like killing old TCP connections. I will let you know if I find something acceptable that may also be useful for others in the meantime.

 

 

Bye,

Michele

Re: Impala close JDBC session

Master Collaborator

Hi Michele,

 

even if there's an issue with the JDBC driver, Impala's session timeout should still work. It's surprising that it doesn't. To get to the bottom of that problem, you might try setting the session timeout and then monitoring the impalad's RPC activity from the WebUI (impalad:25000/rpcz). A session will time out after a period of inactivity.
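
If opening the debug page in a browser is inconvenient, the same page can be polled from Java. The following is only a sketch under assumptions: the `ImpaladDebugPage` class is a hypothetical helper, it assumes the web UI is reachable over plain HTTP without authentication, and it returns the raw page body without trying to interpret it. It requires Java 11 or later for `java.net.http.HttpClient`.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class ImpaladDebugPage {

    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(5))
            .build();

    /** Fetches a debug page such as "/rpcz" or "/sessions" and returns its body as text. */
    static String fetch(String host, int port, String path)
            throws IOException, InterruptedException {
        URI uri = URI.create("http://" + host + ":" + port + path);
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        HttpResponse<String> response =
                CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            // Fail loudly instead of returning an error page as if it were data.
            throw new IOException("Unexpected HTTP status "
                    + response.statusCode() + " from " + uri);
        }
        return response.body();
    }
}
```

For example, `ImpaladDebugPage.fetch("impalad", 25000, "/rpcz")` would retrieve the page mentioned above, with `impalad` standing in for the real host name.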

 

Alex

Re: Impala close JDBC session

Explorer

Hi Alex,

if you are speaking about the "--idle_session_timeout" parameter: after the timeout, the session is marked as EXPIRED in impalad:25000/sessions, but not as CLOSED. In Cloudera Manager, too, the connection is still counted in the graph showing the number of active connections.

With "netstat -a" on the OS I can see the connection still reported as ESTABLISHED, so Impalad is not closing it. That is consistent with http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_timeo... where it seems that the client should be the one closing the session, not Impalad.

 

At the moment we have worked around the problem by forcing the Java program to run the garbage collector more often. It is not a good solution, but at least it works.
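
A workaround of this kind might look roughly like the sketch below. It is not the code actually used here: the `PeriodicGc` class and the interval are hypothetical, and `System.gc()` is only a request that the JVM is free to ignore. A plausible but unconfirmed explanation for why it helps is that leaked connection objects release their sockets only once they are collected.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class PeriodicGc implements AutoCloseable {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "periodic-gc");
                t.setDaemon(true); // do not keep the JVM alive just for this task
                return t;
            });
    private final AtomicLong requests = new AtomicLong();

    /** Starts requesting a garbage collection at a fixed interval. */
    PeriodicGc(long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            System.gc(); // a hint only; the JVM decides whether to collect
            requests.incrementAndGet();
        }, interval, interval, unit);
    }

    /** Number of GC requests issued so far (useful for logging). */
    long requestCount() {
        return requests.get();
    }

    /** Stops issuing GC requests. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

An application could create one instance at startup, for example `new PeriodicGc(30, TimeUnit.SECONDS)`, and close it on shutdown. The interval is a tuning guess and trades CPU time against how long stale connections linger.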

 

 

Michele

Re: Impala close JDBC session

Master Collaborator

Hi Michele,

 

ahh yes, you are right, of course.

 

Your workaround sounds like a very interesting detail that may help identify the problem and expedite a fix!

Thanks for following up, and my apologies for the inconvenience.

 

Best,

 

Alex

Re: Impala close JDBC session

Hi,

 

We are experiencing the same problem when connecting Tableau with Impala.

I've set the idle_session_timeout property to a short value (3 minutes) and I can easily reproduce this.

After those 3 minutes the session is "expired" but shown as "not closed" in the Impala daemon web UI.

Also, on the command line of the Linux box, the netstat -nap command shows that the connection from the Tableau server (client) is still ESTABLISHED.

We use Impala ODBC driver.

 

What I don't understand is why the client should be the one closing the connection?

How does the client (e.g. Tableau) know about the expired session in Impala after the idle timeout?

Is Impala sending such a command to the client?

Is the ODBC driver provided by Cloudera (acting as the client in this case), missing any "expired session" message?

 

This is a big pain in the neck: once the timeout has expired, the users cannot refresh the Tableau dashboard any more until Tableau or impalad is restarted.

 

Tableau shows that Impala connection has expired

New Contributor

We are having the same issue as described in the previous reply (Tableau connected to Impala, with the session shown as expired but not closed after idle_session_timeout). Has anyone found a solution other than restarting the services?