I added some new nodes to my cluster and they work fine. Then I added Spark Gateway roles to all the new nodes. We're using YARN to manage and distribute Spark work.
Is adding Spark Gateway roles to the new nodes enough to make YARN think, "Hey, there are some new nodes here, let's distribute some containers and work to them"? Or do I have to add YARN Gateway roles to these new nodes too?
How can I make sure that YARN will use these new nodes when executing jobs, to reduce the overall workload of my cluster?
When you submit a Spark application through YARN, it runs on whatever resources YARN has available. In your case, you need to add more YARN Gateway nodes so that more resources are available for processing. Simply adding new nodes to the cluster is not enough for YARN to distribute processing across all of them.
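To check whether YARN actually sees the new nodes, you can ask the ResourceManager which nodes are registered and in the RUNNING state, since YARN only places containers on nodes that have registered with it. Below is a rough sketch using the ResourceManager REST endpoint `/ws/v1/cluster/nodes`. The ResourceManager address and the hostnames `new-node-1` and `new-node-2` are placeholders I made up, and the sample payload is illustrative, not output from a real cluster.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_url):
    """Fetch the node list from the ResourceManager REST API.

    rm_url is a placeholder such as "http://rm-host:8088" (8088 is the
    default ResourceManager web port; yours may differ).
    """
    with urlopen(f"{rm_url}/ws/v1/cluster/nodes") as resp:
        return json.load(resp)


def missing_nodes(payload, expected_hosts):
    """Return the expected hosts that YARN does not report as RUNNING."""
    # The API wraps the list as {"nodes": {"node": [...]}}; guard against
    # an empty cluster where "nodes" may be null.
    nodes = (payload.get("nodes") or {}).get("node") or []
    running = {n["nodeHostName"] for n in nodes if n.get("state") == "RUNNING"}
    return sorted(set(expected_hosts) - running)


# Illustrative payload only (hypothetical hostnames, trimmed fields).
sample = {
    "nodes": {
        "node": [
            {"nodeHostName": "old-node-1", "state": "RUNNING"},
            {"nodeHostName": "new-node-1", "state": "RUNNING"},
        ]
    }
}

# Hosts we expect YARN to schedule on (hypothetical names).
not_seen = missing_nodes(sample, ["new-node-1", "new-node-2"])
print("Not usable by YARN yet:", not_seen)
```

Against a live cluster you would call `missing_nodes(fetch_nodes("http://rm-host:8088"), [...])` with your own hostnames. Any host that comes back in the list is not yet registered with YARN, so no containers will be scheduled there. The command `yarn node -list -all` gives the same information from the command line.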