Frequent "Read timed out" error from Hive, using Cloudera Hive JDBC Driver

Explorer

Hi,

 

We've been frequently seeing the types of errors quoted below. We are using the Cloudera Hive JDBC driver version 2.5.18. (Version 2.5.20 is available; we have not upgraded to it, but it doesn't seem to include any fixes related to the issues below.)

 

CDH version = 5.9.1.

 

Does anyone have any idea as to why we may be frequently seeing these errors?

 

The Hive URL we're using looks like this (one string but breaking down property by property for readability):

jdbc:hive2://<host>:10000/default;
AuthMech=1;
KrbRealm=<realm>;
KrbHostFQDN=<host fqn>;
KrbServiceName=hive;
UseNativeQuery=1;
ssl=1;
sslTrustStore=<store path>;
trustStorePassword=<store pwd>

The errors below tend to happen when we try to obtain a connection. Presumably, the SocketTimeout driver configuration option would not be relevant in this case (we have considered setting it to 0 so that idle connections are not closed).

 

The errors:

 

Read timed out

org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_CloseSession(TCLIService.java:182)
        at com.cloudera.hiveserver2.hivecommon.api.HS2ClientWrapper.recv_CloseSession(Unknown Source)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.CloseSession(TCLIService.java:169)
        at com.cloudera.hiveserver2.hivecommon.api.HS2ClientWrapper.CloseSession(Unknown Source)
        at com.cloudera.hiveserver2.hivecommon.api.HS2Client.closeSession(Unknown Source)
        at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.connect(Unknown Source)
        at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:247)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:150)
        at java.net.SocketInputStream.read(SocketInputStream.java:121)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)

Out of sequence response

org.apache.thrift.TApplicationException: CloseSession failed: out of sequence response
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_CloseSession(TCLIService.java:182)
        at com.cloudera.hiveserver2.hivecommon.api.HS2ClientWrapper.recv_CloseSession(Unknown Source)
        at org.apache.hive.service.cli.thrift.TCLIService$Client.CloseSession(TCLIService.java:169)
        at com.cloudera.hiveserver2.hivecommon.api.HS2ClientWrapper.CloseSession(Unknown Source)
        at com.cloudera.hiveserver2.hivecommon.api.HS2Client.closeSession(Unknown Source)
        at com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.connect(Unknown Source)
        at com.cloudera.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.cloudera.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at java.sql.DriverManager.getConnection(DriverManager.java:664)
        at java.sql.DriverManager.getConnection(DriverManager.java:270)

Close session error

java.sql.SQLException: [Cloudera][HiveJDBCDriver](500151) Error setting/closing session: Close Session Error.
        at com.cloudera.hiveserver2.hivecommon.api.HS2Client.closeSession(Unknown Source)

Any insight would be appreciated.

3 REPLIES

Champion

Since it's a generic exception, narrowing down the issue could be a little challenging,

so I would start by checking that the HiveServer2 parameters below are set appropriately for the environment.

 

hive.server2.session.check.interval
hive.server2.idle.operation.timeout
hive.server2.idle.session.timeout

I believe you are running a long-running query or job that fails with this error?

Could you let me know whether you have multiple HS2 instances or just one in your environment?

Explorer

Hi csguna, thanks for your reply.

 

Yes indeed we have a job which connects to Hive, retrieves some data, then closes the connection. This is a finite job, with a start and stop within a few hours. Generally, 2-4 connections are made during the job's lifetime, where we connect, retrieve some data, then disconnect, or connect, insert some data via LOAD DATA INPATH, then disconnect.

 

This job fails frequently in environments with higher amounts of traffic/activity.  I believe there are two HS2 instances in that cluster.

 

hive.server2.session.check.interval=1 hour

hive.server2.idle.operation.timeout=1 day

hive.server2.idle.session.timeout=1 day

 

What worries me is that our session appears to be closed for no reason while we're just trying to connect.

 

Inspecting the driver's code:

Variant localVariant3 = getOptionalSetting("SocketTimeOut", paramConnSettingRequestMap);
.......
if ((null == localVariant3) && (0 != m_settings.m_timeout)) { m_hiveClient.closeSession();

in com.cloudera.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.

 

I'm not following the logic behind the code which simply examines the value of SocketTimeout and if it happens to not be specified, then it closes the session.  Why?  And why would that fail with a "Read timed out"?

 

In theory, we can try running with SocketTimeout=0 tacked onto the JDBC URL we're using. But I'd really want to understand a) whether that's necessary, b) if so, why, and c) whether something else might be going on.
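
For reference, a minimal sketch of what tacking SocketTimeout onto the URL could look like. The property names are the ones from the URL in the original post; the class and method names (HiveUrlSketch, buildUrl) are made up for illustration, the placeholders are left unfilled, and the assumption that a value of 0 means "never time out" should be checked against the driver documentation for the version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Cloudera driver.
class HiveUrlSketch {

    // Appends each property to the base URL as ";name=value",
    // preserving the order in which the properties were added.
    static String buildUrl(String base, Map<String, String> props) {
        StringBuilder sb = new StringBuilder(base);
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("AuthMech", "1");
        props.put("KrbRealm", "<realm>");
        props.put("KrbHostFQDN", "<host fqn>");
        props.put("KrbServiceName", "hive");
        props.put("UseNativeQuery", "1");
        props.put("ssl", "1");
        props.put("sslTrustStore", "<store path>");
        props.put("trustStorePassword", "<store pwd>");
        // Assumption: 0 disables the socket timeout in this driver.
        props.put("SocketTimeout", "0");

        System.out.println(buildUrl("jdbc:hive2://<host>:10000/default", props));
    }
}
```

Building the URL from an ordered map keeps it readable property by property while still producing the single string the driver expects.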

Explorer

The SocketTimeoutException in the title occurs when the Thrift client in the HiveConnection object is actively reading SQL results from HiveServer2 (the Thrift server) and receives nothing before the TSocket's timeout expires. You can check the source code of HiveConnection.setupLoginTimeout and HiveAuthFactory.getSocketTransport.

So you need to either tune HiveServer2 or increase the TSocket's timeout setting.

And for now, the only way to increase the TSocket's timeout setting is via DriverManager.setLoginTimeout().
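
A minimal sketch of that call, assuming the Apache Hive JDBC driver (HiveConnection) is the one in use; whether the Cloudera driver from the original question honors the login timeout in the same way is not confirmed here. The class name LoginTimeoutSketch and the 120-second value are arbitrary choices for illustration.

```java
import java.sql.DriverManager;

// Hypothetical wrapper around the standard JDBC login timeout setting.
class LoginTimeoutSketch {

    // Sets the JVM-wide JDBC login timeout (in seconds) and returns
    // the value DriverManager now reports.
    static int applyLoginTimeout(int seconds) {
        DriverManager.setLoginTimeout(seconds);
        return DriverManager.getLoginTimeout();
    }

    public static void main(String[] args) {
        // Must run before DriverManager.getConnection(...) is called.
        int effective = applyLoginTimeout(120);
        System.out.println("Login timeout (seconds): " + effective);
    }
}
```

Note that DriverManager.setLoginTimeout() is a static, process-wide setting, so it affects every JDBC driver loaded in the same JVM, not just Hive connections.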

 

you can check below jira for more information:

keep striving!