1. I know that in Spark client mode, the driver program runs on the local machine where the shell is opened. When I run a query from Beeline, where does the driver run? I read that the Spark Thrift Server (STS) runs in client mode only. So when I connect from machine M1 to an STS running on M2, does the driver program run on M1 or M2?
2. I read that we don't have to set the number of executors in Spark when dynamic allocation is enabled. Is that correct? How do I test that? Are there any other parameters I have to set? I enabled dynamic allocation and started the Spark shell as:
spark-shell --master yarn --deploy-mode client
When I do this, I see in Ambari that the Spark shell starts with only 3 containers. Then, when I run a query, the number of running containers stays at 3 throughout and the query execution is very slow. I was hoping the number of containers would increase during the query run and the execution would be much faster.
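For context, dynamic allocation on YARN typically needs more than just enabling the flag: the external shuffle service must be running on the NodeManagers, and min/max executor bounds control how far it can scale. A minimal sketch of the relevant settings (the numeric values here are placeholders, not recommendations, and the shuffle service must additionally be configured in `yarn-site.xml`) might look like:

```shell
# Sketch only: dynamic allocation also requires the external shuffle
# service to be enabled on every NodeManager; without it, executors
# cannot be released safely and scaling may not happen.
spark-shell --master yarn --deploy-mode client \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.initialExecutors=3 \
  --conf spark.dynamicAllocation.maxExecutors=20
```

If `maxExecutors` is left at a low default, or the shuffle service is absent, the container count can stay pinned at its initial value, which would match the behavior described above.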
Instead, when I specify the number of executors and the executor memory at start time, I get the best query execution times. The command I use for best performance is: