Run Spark Thrift server on multiple nodes

Currently I am running only one Thrift server on my cluster.

When there are many client connections, Spark runs very slowly.

I looked for a solution and found this doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/install-sp...

It says: Deploying the Thrift server on multiple hosts increases scalability of the Thrift server; the number of hosts should take into consideration the cluster capacity allocated to Spark.

So if I deploy more Thrift servers on my cluster, can I handle multiple concurrent client connections? If not, how can I handle this issue?

Another question: can I run two Spark apps with a different configuration for each?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Run Spark Thrift server on multiple nodes

Explorer

Hoang,

I think what that post is saying is that, while there is not yet a load-balancing solution for STS on a secure cluster, in a non-Kerberos environment any external load balancer, like haproxy or httpd + mod_jk, can be used to distribute requests. The Thrift servers will be independent (no ZK coordination) and the load balancer will simply round-robin client requests.
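To make the round-robin idea concrete, here is a minimal sketch in Python of what such a load balancer does at the connection level. It is only an illustration, not a replacement for haproxy: the host names, the backend port 10015, and the listen port 10099 are all assumptions, so substitute the values from your own cluster. Each new client connection is handed to the next Thrift server in the rotation and stays on that server until it closes.

```python
import asyncio
import itertools

# Hypothetical Spark Thrift server endpoints (assumed names and port);
# replace them with the hosts and port used on your cluster.
BACKENDS = [("sts-node1.example.com", 10015), ("sts-node2.example.com", 10015)]
_rotation = itertools.cycle(BACKENDS)


def next_backend():
    """Return the next (host, port) pair in round-robin order."""
    return next(_rotation)


async def _pipe(reader, writer):
    """Copy bytes from reader to writer until the source side closes."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def handle_client(client_reader, client_writer):
    """Forward one client connection to the next backend in the rotation."""
    host, port = next_backend()
    try:
        backend_reader, backend_writer = await asyncio.open_connection(host, port)
    except OSError:
        # Backend unreachable: drop the client instead of retrying elsewhere.
        client_writer.close()
        return
    # Relay traffic in both directions for the lifetime of the connection.
    await asyncio.gather(
        _pipe(client_reader, backend_writer),
        _pipe(backend_reader, client_writer),
        return_exceptions=True,
    )


async def serve(listen_port=10099):
    """Accept client connections on listen_port (an arbitrary choice)."""
    server = await asyncio.start_server(handle_client, "0.0.0.0", listen_port)
    async with server:
        await server.serve_forever()
```

Starting it with `asyncio.run(serve())` and pointing clients at the proxy host and port instead of a single Thrift server would spread new connections across the two backends. A real deployment would more likely use haproxy or a similar tool, which also adds health checks that this sketch leaves out.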

5 REPLIES 5

Re: Run Spark Thrift server on multiple nodes

Contributor

I think you need load balancing for the Spark Thrift servers.

Have you looked at this link?

https://community.hortonworks.com/questions/29687/how-to-do-load-balancing-spark-thrift-servers-on-h.html

Re: Run Spark Thrift server on multiple nodes

@youngick kim

I followed the link you suggested, but I cannot find any guide there that solves my issue.

The best answer said:

So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd +mod_jk or any other load balancer will probably do the work.

P.S.: Currently, my cluster does not have Kerberos enabled.

Re: Run Spark Thrift server on multiple nodes

So I need to use a third-party service like HAProxy to load balance the Thrift servers on HDP?

Re: Run Spark Thrift server on multiple nodes

Explorer

Yes, that is my understanding.
