I run Spark SQL queries through a Thrift server.
I know that if multiple SQL queries are submitted through the Thrift server, each query is run sequentially.
If many users want to query a table on a Spark cluster running on YARN at the same time, how can their queries be run concurrently?
The queries are read-only: they only query the table and never update it.
My idea is that, because each Thrift server has its own dedicated set of executors, running multiple Thrift servers would allow multiple queries to be processed concurrently.
Does anyone have suggestions for this situation?
Thanks in advance.
Have you taken a look at http://spark.apache.org/docs/1.6.2/job-scheduling.html? Also, if you start the thrift server in yarn-client mode, you should be able to take advantage of YARN resource scheduling and queues.
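As a rough illustration of what that page describes: jobs inside a single Spark application (such as one Thrift server) are scheduled FIFO by default, and switching the scheduler to FAIR lets jobs from different sessions share the executors. A minimal sketch of the relevant `spark-defaults.conf` entries might look like the following. The queue name `thrift` and the allocation file path are placeholders I made up, not values from this thread.

```properties
# Run the Thrift server's driver in yarn-client mode (Spark 1.6 syntax)
spark.master                      yarn-client

# Share executors between concurrent jobs instead of the default FIFO order
spark.scheduler.mode              FAIR

# Optional: pool definitions (hypothetical path, adjust to your install)
spark.scheduler.allocation.file   /path/to/fairscheduler.xml

# YARN queue to submit the application to (hypothetical queue name)
spark.yarn.queue                  thrift
```

With FAIR mode enabled, a JDBC client session can also be assigned to a specific pool by running `SET spark.sql.thriftserver.scheduler.pool=<pool name>;` before its queries, as described in the Spark SQL guide. Whether this is enough, or whether separate Thrift servers on separate YARN queues work better, probably depends on your workload.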