Run Spark Thrift server on multiple nodes

Rising Star

Currently I am running only one Thrift server on my cluster.

I see that when there are many client connections, Spark runs very slowly.

While looking for a solution I found this doc: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/install-sp...

It says: Deploying the Thrift server on multiple hosts increases scalability of the Thrift server; the number of hosts should take into consideration the cluster capacity allocated to Spark.

So if I deploy more Thrift servers on my cluster, can I handle multiple concurrent client connections? If not, how can I handle this issue?

Another question: can I run two Spark apps, each with its own configuration?
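
To make the second question concrete, this is roughly what I have in mind: two Thrift Server instances started with the standard start-thriftserver.sh script, each bound to its own port and with its own Spark settings (the ports and memory values below are only placeholders):

# first instance: smaller executors (placeholder values)
./sbin/start-thriftserver.sh \
  --master yarn \
  --hiveconf hive.server2.thrift.port=10015 \
  --conf spark.executor.memory=2g

# second instance, on another host or a different port: larger executors
./sbin/start-thriftserver.sh \
  --master yarn \
  --hiveconf hive.server2.thrift.port=10016 \
  --conf spark.executor.memory=4g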


Rising Star

I think you need load balancing for the Spark Thrift servers.

Why don't you refer to this link?

https://community.hortonworks.com/questions/29687/how-to-do-load-balancing-spark-thrift-servers-on-h.html

Rising Star

@youngick kim

I referred to the link you suggested, but I could not find any guide there that solves my issue.

The best answer said:

So for now... no load balancing for STS if the cluster is kerberized, otherwise haproxy, httpd + mod_jk or any other load balancer will probably do the work.

P/S: Currently, Kerberos is not enabled on my cluster.


Rising Star

So I need to use a third-party service like HAProxy to run Thrift server load balancing on HDP.

Explorer

Yes, that is my understanding.
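
For example (just a sketch; the host names and ports are placeholders, and it assumes two non-kerberized Thrift Server instances listening on port 10015), an HAProxy front end could be as simple as:

# haproxy.cfg excerpt: round-robin TCP load balancing across two Thrift servers
listen spark-thrift
    bind *:10099
    mode tcp
    option tcplog
    balance roundrobin
    server sts1 sts-host1.example.com:10015 check
    server sts2 sts-host2.example.com:10015 check

Clients would then point their JDBC/Beeline connections at the HAProxy address, e.g. jdbc:hive2://haproxy-host:10099/default, instead of at an individual Thrift server.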