Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How to do load balancing spark thrift servers on HWX?

avatar
Expert Contributor

Installed version of HDP is 2.3.4. how to load balancing spark thrift servers on HWX?

1 ACCEPTED SOLUTION

avatar

Hello Kavita,

I have not found any doc to put a load balancer in front of STS when the cluster is kerberized (hence the post here 🙂 ).

HiveServer2

Load balancing in front of HiveServer2 in a kerberized environment can be achieved by invoking the zookeeper -- see doc here for how it works: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_hadoop-ha/content/ha-hs2-service-discove...

This worked out of the box on HDP-2.3 (all the configuration necessary was already set in hive-site), the props are

hive.server2.support.dynamic.service.discovery=true
hive.server2.zookeeper.namespace=sparkhiveserver2
hive.zookeeper.quorum=zk_host1:port1,zk_host2:port2,zk_host3:port3...

Spark Thrift Server

I have replicated a similar configuration in my /etc/spark/conf/hive-site.xml but it did not work. It appears this functionality is currently being added to the Apache-Spark (so we will have to wait a bit longer for it be included in the HWX distro). See:

  1. this JIRA reporting that STS is not registering with Zookeeper like HS2 does: https://issues.apache.org/jira/browse/SPARK-11100
  2. this github pull request -- it seems a fix has been written and could be merged into the Master branch in the coming weeks: https://github.com/apache/spark/pull/9113

So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd +mod_jk or any other load balancer will probably do the work.

Cheers!

View solution in original post

7 REPLIES 7

avatar
Master Guru

@kavitha velaga You can use a virtual or physical load balancer and use methods ie round robin, ratio, dynamic ration, least connections, etc. Does that help?

avatar

Is it possible to implement Load Balancing in front of multiple spark thrift servers (sts) if the cluster is kerberized? Ie: How to get around the fact that the host-specific principal has to be mentioned in the connection string. See attempts below (a kinit was done before -- the linux user has a valid TGT):

#1 Direct connection to STS (no load balancer):

Both of these connection strings work since the keytab for hive/sts_host1_fqdn@REALM is present on sts_host1.

$ beeline -u "jdbc:hive2://sts_host1:10001/default;principal=hive/sts_host1_fqdn@REALM" 
or
$ beeline -u "jdbc:hive2://sts_host1:10001/default;principal=hive/_HOST@REALM"

(_HOST will resolve to the sts_host1's fqdn).

#2 Connection via Load Balancer to one of the STSs:

This will only work if load balancer forwards the request to sts_host1 (since the only sts_host1 has the keytab for hive/sts_host1_fqdn@REALM).

$ beeline -u "jdbc:hive2://sts_loadbalancer_host:10001/default;principal=hive/sts_host1_fqdn@REALM"
...
Error: Could not open client transport with JDBC Uri: jdbc:hive2://sts_loadbalancer_host:10001/default;principal=hive/sts_host1_fqdn@REALM: Peer indicated failure: GSS initiate failed (state=08S01,code=0)

This seemed like a good solution but does not work at all, regardless of the sts the request is forwarded to. (It seems _HOST is resolved to the load balancer fqdn -- no keytab for this. We have also tried creating a principal lb/lb_fqdn@REALM and setting the keytab on the servers in /etc/security/keytabs and using this principal in the connection string but this did not solve the issue).

$ beeline -u "jdbc:hive2://sts_loadbalancer_host:10001/default;principal=hive/_HOST@REALM"
...
16/04/26 15:37:33 [main]: ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]
Finally, we have tried to specify Spark's principal in the connection string since it is not host-dependent, but this principal is refused as is does not 'contain 3 parts' separated by either '/' or '@' (ala: name/host_fqdn@REALM).
$ beeline -u "jdbc:hive2://sts_loadbalancer_host:10001/default;principal=spark-cluster_id@REALM" 
...
Kerberos principal should have 3 parts: spark-cluster_id@REALM

Thanks for posting a reply if you have mastered the kerberized-loadbalanced-spark-thrift-server dragon in the past!

Then there will be the question of session stickiness for beeline / JDBC connections sending more than one request but one problem at time... 🙂

avatar
Expert Contributor

Thank you all. Raphael, I didn't find the documentation for this. Can you please send me the link?

avatar

Hello Kavita,

I have not found any doc to put a load balancer in front of STS when the cluster is kerberized (hence the post here 🙂 ).

HiveServer2

Load balancing in front of HiveServer2 in a kerberized environment can be achieved by invoking the zookeeper -- see doc here for how it works: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_hadoop-ha/content/ha-hs2-service-discove...

This worked out of the box on HDP-2.3 (all the configuration necessary was already set in hive-site), the props are

hive.server2.support.dynamic.service.discovery=true
hive.server2.zookeeper.namespace=sparkhiveserver2
hive.zookeeper.quorum=zk_host1:port1,zk_host2:port2,zk_host3:port3...

Spark Thrift Server

I have replicated a similar configuration in my /etc/spark/conf/hive-site.xml but it did not work. It appears this functionality is currently being added to the Apache-Spark (so we will have to wait a bit longer for it be included in the HWX distro). See:

  1. this JIRA reporting that STS is not registering with Zookeeper like HS2 does: https://issues.apache.org/jira/browse/SPARK-11100
  2. this github pull request -- it seems a fix has been written and could be merged into the Master branch in the coming weeks: https://github.com/apache/spark/pull/9113

So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd +mod_jk or any other load balancer will probably do the work.

Cheers!

avatar
Master Guru

@Raphael Vannson Great analysis. is this only true for kerberized cluster?

avatar

These are HDPs' STS and HS2 load balancing current capabilities for a kerberized cluster I am aware of.

For a non-kerberized cluster: haproxy, httpd +mod_jk or any other soft / hard load balancer will probably do the work.

avatar
Master Guru

@Raphael Vannson you mean aren't correct?