After searching the web, this question seems to go largely unanswered. Let's say you have a process that does two things:
1) joins users with social groups
2) joins bars with social groups
These operations are independent of each other and don't need to run sequentially, meaning you don't need to wait for query 1 to finish before starting query 2.
Has anyone used Java Futures or another pattern to run queries in parallel on Spark? In some cases I see this as the best way to take advantage of more cores and RAM.
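From a single driver thread, Spark works on one query at a time, because each action blocks until its job has finished. To get around that, either submit the queries from separate threads within the same application (Spark's scheduler accepts jobs from multiple threads) or run them as separate Spark applications. That said, if you have free memory/cores while your queries are running, then you are requesting more resources up front than those queries need.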
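For reference, the Futures pattern from the question can be sketched in plain Java roughly as follows. This is only a sketch under assumptions: `ParallelQueries`, `joinUsersWithGroups`, `joinBarsWithGroups`, and `runBoth` are hypothetical names, and the two methods are stand-ins that sleep and return a dummy value. In a real application each would wrap its own Spark action (a count, a write, and so on), and whether the two jobs actually overlap on the cluster depends on the scheduler configuration and on how many executor cores are free.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelQueries {

    // Hypothetical stand-in for "join users with social groups".
    // A real version would end in a Spark action and return its result.
    static Long joinUsersWithGroups() throws InterruptedException {
        Thread.sleep(200); // simulate query time
        return 1L;         // dummy result
    }

    // Hypothetical stand-in for "join bars with social groups".
    static Long joinBarsWithGroups() throws InterruptedException {
        Thread.sleep(200); // simulate query time
        return 2L;         // dummy result
    }

    // Submits both independent tasks from separate threads, then waits for both.
    static long[] runBoth() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Long> usersTask = ParallelQueries::joinUsersWithGroups;
            Callable<Long> barsTask = ParallelQueries::joinBarsWithGroups;

            // Both tasks start running as soon as they are submitted.
            Future<Long> users = pool.submit(usersTask);
            Future<Long> bars = pool.submit(barsTask);

            // get() blocks until the corresponding task has finished.
            return new long[] { users.get(), bars.get() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long[] results = runBoth();
        System.out.println(results[0] + " " + results[1]);
    }
}
```

The pool has two threads because there are two independent queries; neither result is read until both tasks have been submitted, so query 2 does not wait on query 1.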