07-10-2018 10:07 PM
Going by my understanding, Spark Standalone mode has a master-slave architecture with two daemons running (Spark Master and Spark Worker).
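For context, this is roughly how I'm bringing up the two daemons and submitting an app (a minimal sketch; the host name, port, and jar path are just placeholders, and on older Spark versions the worker script is named `start-slave.sh` instead of `start-worker.sh`):

```shell
# Start the Spark Master daemon (serves the cluster UI on :8080 by default)
$SPARK_HOME/sbin/start-master.sh

# Start a Worker daemon and register it with the master
# (replace master-host with the actual master's hostname)
$SPARK_HOME/sbin/start-worker.sh spark://master-host:7077

# Submit an application directly to the standalone master
# (class and jar are illustrative placeholders)
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyApp \
  /path/to/my-app.jar
```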
Since the Spark Master handles both resource scheduling and the monitoring of individual jobs, isn't that an issue when it comes to scalability?
E.g.: MRv1 had a similar limitation, where the JobTracker was responsible for both managing resources and monitoring tasks (which is why it couldn't scale). That is why MRv2 introduced the ApplicationMaster, to delegate part of that responsibility. But we don't have an AM in Spark Standalone, so how scalable is it overall?