Keep alive configuration for Hive clients working with LB Proxy

Explorer

Hi All,

 

We have recently configured an F5 proxy in front of our HiveServer2 services to support the growing number of clients accessing the HS2 service, following this procedure:

http://www.cloudera.com/documentation/other/reference-architecture/PDF/Impala-HA-with-F5-BIG-IP.pdf

 

We started with a session timeout of 1 hour, but we quickly found that for long-running Hive queries the proxy kills the connection once the timeout is reached.

 

I assumed that Hive clients such as beeline/PyHive/Cloudera ODBC/etc. are aware that Hive queries are usually long-running and would therefore implement a keep-alive mechanism to keep the connection active until the Hive process finishes.

To my surprise, none of the Hive clients we use implements such a keep-alive mechanism. Only after we increased the proxy's session timeout to be longer than our longest Hive query did our long Hive processes stop being killed by the proxy.
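
For illustration, one client-side workaround is to submit the query asynchronously and poll its status in a loop, so the Thrift connection carries periodic traffic instead of going silent until the query finishes. A rough sketch with PyHive (the host is a placeholder for our F5 VIP; older PyHive versions spell the flag async instead of async_):

import time
from pyhive import hive
from TCLIService.ttypes import TOperationState

# Placeholder host: the F5 VIP sitting in front of the HS2 instances.
conn = hive.Connection(host='hs2-proxy.example.com', port=10000, username='eyal')
cursor = conn.cursor()

# Submit the query without blocking until it completes.
cursor.execute('SELECT COUNT(*) FROM big_table', async_=True)

# Each poll() is a GetOperationStatus RPC over the same TCP connection,
# so the proxy sees traffic and never considers the connection idle.
state = cursor.poll().operationState
while state in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
    time.sleep(30)  # keep this well below the proxy's idle timeout
    state = cursor.poll().operationState

print(cursor.fetchall())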

 

Digging a bit deeper into the HS2 configuration, I found the parameter hive.server2.idle.session.timeout, which is set to 12 hours, and I understood why all our Hive processes worked perfectly before we put the proxy in place.
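
For reference, the effective value can be checked from any client session with a plain SET statement; a rough sketch with PyHive again (placeholder host, and the exact output format varies by Hive version; the raw value is in milliseconds, where 43200000 ms is 12 hours):

from pyhive import hive

conn = hive.Connection(host='hs2-proxy.example.com', port=10000, username='eyal')
cursor = conn.cursor()

# 'SET <property>' returns the effective setting as a single 'key=value' row.
cursor.execute('SET hive.server2.idle.session.timeout')
print(cursor.fetchall())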

 

Our network guys said that setting the session timeout at the proxy level to 12 hours is not best practice, and that the clients accessing HS2 should implement a keep-alive mechanism.

 

Is there a better way to address this keep-alive issue, or is setting the proxy's session timeout to be longer than the longest query the way to go?

 

Best,

 

Eyal

5 REPLIES

Super Guru
HiveServer2 will NOT count a session as idle while a query is still running, so a query can run longer than the session timeout, as long as it is actually executing and not just waiting for the client to fetch data. Please refer to https://issues.apache.org/jira/browse/HIVE-10146.

This looks like an issue on F5's side: the proxy has no way to know that a query is still running, which makes sense.

Having keep-alive pings to HS2 would make the Hive session timeout setting useless, so I am not sure that is best practice either.

Explorer

Hi Eric, 

 

To set up the proxy for HS2 using F5, I followed the exact steps described in the following Cloudera "Impala HA with F5 BIG-IP" manual:

http://www.cloudera.com/documentation/other/reference-architecture/PDF/Impala-HA-with-F5-BIG-IP.pdf

 

 

I know the above manual is for setting up a proxy for Impala, but I don't see how the steps should differ for HS2.

 

If this is an official Cloudera manual, how come long queries (longer than 1 hour) in both Impala and HS2 sessions stopped being killed only after we increased the F5 proxy timeout itself from 1 hour to 12 hours?

 

Was the above procedure written against a specific F5 version?

 

Best,

 

Eyal

Explorer

Hi Eric,

 

We followed Cloudera's guide on how to set up a load balancer using F5, and we still had to increase the session timeout to 12 hours to stop long processes from failing. Are we missing anything?

 

Do you know of other companies that implemented a load balancer using F5 and ran into similar issues?

 

Best,

 

Eyal

New Contributor
F5 has an idle timeout of 300 seconds by default.
Extending it is not best practice, simply because the idle timeout exists for connections that haven't communicated for a period of time and are therefore considered dead, as well as for cases where a connection didn't tear down cleanly.
It also protects the F5 (or any other load balancer) from exhausting its connection table, which could otherwise lead to a denial of service.

Keep-alive puts control of the connection's lifetime at both endpoints; the network should be transparent to it.
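
To illustrate what endpoint-controlled keep-alive looks like, here is a rough sketch (Python, Linux-specific socket options, placeholder timing values) that enables OS-level TCP keep-alive probes on a client socket. The probe timings must stay well under the LB's idle timeout, and the LB's TCP profile has to be configured to reset its idle timer when it sees those probes:

import socket

def enable_tcp_keepalive(sock, idle=60, interval=30, count=4):
    # Ask the OS to send keep-alive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific tuning, guarded since other platforms lack these constants.
    if hasattr(socket, 'TCP_KEEPIDLE'):
        # Seconds of inactivity before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, 'TCP_KEEPINTVL'):
        # Seconds between subsequent probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, 'TCP_KEEPCNT'):
        # Unanswered probes allowed before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)

With idle=60 and interval=30 the connection never goes more than a minute without traffic, comfortably under the 300-second default mentioned above.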

New Contributor
Btw, having keep-alives would also protect the servers themselves, for the same reasons.