Using Datagrip to query Impala: socket write error

New Contributor

I am using DataGrip to query Impala with Cloudera's JDBC driver. Due to the timeout configured in our cluster, my connection from DataGrip through the driver times out very quickly.

When I run a query after the connection has already expired, I get a "socket write error". I have to close and reopen the connection manually, which is very annoying.

Other products like Toad for Hadoop, which comes with its own driver, automagically handle this case.

Is there any plan to improve the JDBC driver to avoid getting the "socket write error"?
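
For code that owns its JDBC calls (DataGrip itself cannot be patched this way), the manual close-and-reopen step described above could be automated by validating the connection before each use. The following is only a minimal sketch: `ReconnectingSource` and its `Opener` interface are hypothetical names introduced for illustration, and it assumes the driver implements the standard JDBC 4 `Connection.isValid(int)` check in a way that detects a session the server has already dropped.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper: hands out a connection, reopening it when it has gone stale.
public class ReconnectingSource {

    // Hypothetical factory interface describing how a fresh connection is opened,
    // for example () -> DriverManager.getConnection(url).
    public interface Opener {
        Connection open() throws SQLException;
    }

    private final Opener opener;
    private final int validationTimeoutSeconds;
    private Connection current;

    public ReconnectingSource(Opener opener, int validationTimeoutSeconds) {
        this.opener = opener;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    // Return a usable connection, replacing the cached one if validation fails.
    public synchronized Connection get() throws SQLException {
        if (current == null || !isUsable(current)) {
            closeQuietly(current);
            current = opener.open();
        }
        return current;
    }

    // Treat any exception during validation as "not usable".
    private boolean isUsable(Connection c) {
        try {
            return !c.isClosed() && c.isValid(validationTimeoutSeconds);
        } catch (SQLException e) {
            return false;
        }
    }

    // Closing a dead connection may itself fail; that failure is ignored here.
    private static void closeQuietly(Connection c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (SQLException ignored) {
            // nothing useful can be done with a failure to close a stale connection
        }
    }
}
```

A caller would request `source.get()` before each query instead of holding on to one `Connection`. The extra validation round trip costs a little latency per query, which is the trade-off for not seeing the error after an idle period.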

5 REPLIES

Re: Using Datagrip to query Impala: socket write error

Expert Contributor

Hi Gire,

I don't know if there is a way to automagically fix this. However, I checked the manual ("Cloudera JDBC Driver for Impala") and there is a SocketTimeout option you can set. The manual describes it as follows:

The number of seconds after which Impala closes the connection with the client application if the connection is idle.

When this property is set to 0, idle connections are not closed.

Can you try and see if this fixes your problem? Cheers, Lars

Re: Using Datagrip to query Impala: socket write error

New Contributor

Hi Lars,

I tried setting the SocketTimeOut (SocketTimeout?) property without luck. The connection still expires very quickly and I keep getting the socket write error. The property seems to have no effect at all on the actual timeout.

I tried the following connection strings:

jdbc:impala://cluster_host:21050/database;AuthMech=0;transportMode=binary;SocketTimeOut=0
jdbc:impala://cluster_host:21050/database;AuthMech=0;transportMode=binary;SocketTimeOut=600

jdbc:impala://cluster_host:21050/database;AuthMech=0;transportMode=binary;SocketTimeout=0
jdbc:impala://cluster_host:21050/database;AuthMech=0;transportMode=binary;SocketTimeout=600

 

I am confused because this document uses both SocketTimeOut and SocketTimeout as the name of the timeout property, while also stating that property names are case sensitive.
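
One way to rule out a typo while testing both spellings is to assemble the connection string from a map, so that each property key is written exactly once. This is just a sketch: `ImpalaUrlBuilder` is a hypothetical helper name, and the host, port, database, and property values are the placeholders from the connection strings above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that builds the semicolon-delimited connection string.
public class ImpalaUrlBuilder {

    // Append each key=value pair, separated by ';', after host:port/database.
    public static String build(String host, int port, String database,
                               Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:impala://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        for (Map.Entry<String, String> entry : props.entrySet()) {
            url.append(';').append(entry.getKey()).append('=').append(entry.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Print one URL per spelling and value so each variant can be tried in turn.
        for (String key : new String[] {"SocketTimeout", "SocketTimeOut"}) {
            for (String value : new String[] {"0", "600"}) {
                // LinkedHashMap keeps the properties in insertion order.
                Map<String, String> props = new LinkedHashMap<>();
                props.put("AuthMech", "0");
                props.put("transportMode", "binary");
                props.put(key, value);
                System.out.println(build("cluster_host", 21050, "database", props));
            }
        }
    }
}
```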

 

I am using driver version 2.5.32.1052 (64-bit) for Windows.

Re: Using Datagrip to query Impala: socket write error

New Contributor

Do you think the above behaviour could be a driver bug?

Re: Using Datagrip to query Impala: socket write error

Expert Contributor

Hi Gire,

Apologies for the late reply.

I had a look at the documentation you linked to and it only seems to mention SocketTimeout (lowercase 'o'). I also tried various settings of that property in the connection string with the beeline JDBC client. However, none of them caused the connection to time out. Can you try to reproduce your problem with beeline, too?

It might well be a driver bug, but at this point I'm not sure. Do you currently have an active support subscription with Cloudera? Can you provide a detailed step-by-step guide to reproduce the problem?

Thanks, Lars

Re: Using Datagrip to query Impala: socket write error

New Contributor

Hey Lars,

Unfortunately, we do not have a support subscription.

 

I don't have the beeline command line installed. Below are the steps to reproduce the error I am getting.

  1. Install DataGrip 2016.2.1 (64-bit)
  2. Download Cloudera's JDBC driver version 2.5.32.1052 for Windows (64-bit)
  3. Create a new data driver in DataGrip (add the jars)
  4. Select Class: com.cloudera.impala.jdbc41.Driver
  5. Create a data source. My current connection string looks like: jdbc:impala://server:21050/database;AuthMech=0;transportMode=binary;SocketTimeOut=600
  6. Set a really low timeout value on the server side (I don't have access to the configuration, nor do I know how to set this value; I only know it is around 10 seconds)
  7. Run queries

If I run a query after the timeout, first I get a socket write error or a TTransportException.
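
To check whether the error depends on DataGrip at all, the steps above could be reduced to a small standalone JDBC program that opens a connection, stays idle past the server-side timeout, and then queries again. This is a sketch under several assumptions: the class name `IdleTimeoutRepro`, the `SELECT 1` probe query, and the 30-second default idle period are illustrative choices, the URL is the placeholder from step 5, and the driver jars must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical reproduction: query, idle past the server timeout, query again.
public class IdleTimeoutRepro {

    // Walk the cause chain to the innermost exception, which usually names
    // the transport-level failure hidden behind a generic SQLException.
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Run a trivial query and print the first column of the first row.
    static void probe(Connection conn, String label) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
            System.out.println(label + ": " + rs.getString(1));
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URL from step 5; pass the real one as the first argument.
        String url = args.length > 0 ? args[0]
                : "jdbc:impala://server:21050/database;AuthMech=0;transportMode=binary;SocketTimeOut=600";
        // Assumed idle period, chosen to exceed the roughly 10-second server timeout.
        long idleMillis = args.length > 1 ? Long.parseLong(args[1]) : 30_000L;

        Class.forName("com.cloudera.impala.jdbc41.Driver"); // driver class from step 4
        try (Connection conn = DriverManager.getConnection(url)) {
            probe(conn, "before idle");
            Thread.sleep(idleMillis);
            try {
                probe(conn, "after idle");
            } catch (SQLException e) {
                Throwable root = rootCause(e);
                System.out.println("failed after idle: " + e.getMessage());
                System.out.println("root cause: " + root.getClass().getName()
                        + ": " + root.getMessage());
            }
        }
    }
}
```

If the second probe fails here as well, the behaviour comes from the driver and server rather than from DataGrip, and the printed root cause shows whether a socket error or a TTransportException is underneath.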

 

I read in the release notes that in version 2.5.30 the following point was addressed:

  • When the setQueryTimeout() method is called and the query processing time exceeds the query timeout value, the driver returns a socket timeout error
    The driver now returns the correct error for query timeouts (SqlTimeoutException).

 

Could it be a regression? I am definitely still getting the socket write error.