Created on 08-30-2018 04:46 PM - edited 08-17-2019 06:33 AM
Why use spark-submit if spark-shell is available?
1. spark-shell spawns executors on random nodes, so the chances of data locality are low.
2. spark-submit spawns executors on the nodes where the data is stored, so it performs better than spark-shell.
3. spark-shell is a good fit for data exploration, since it provides an interactive CLI for running your code.
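To make the comparison concrete, here is a minimal sketch of how a spark-submit command line might be assembled for a batch job. The helper name `build_spark_submit_args` and the script name `explore_job.py` are hypothetical and only for illustration. `spark.locality.wait` is a standard Spark property that controls how long the scheduler waits for a data-local slot before falling back to a less local one; the value used here is just an example.

```python
import shlex


def build_spark_submit_args(app_path, master="yarn", deploy_mode="cluster", conf=None):
    """Return the argv list for a spark-submit invocation (it is not executed)."""
    args = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in sorted((conf or {}).items()):
        args += ["--conf", f"{key}={value}"]
    # The application script (or jar) comes after all options.
    args.append(app_path)
    return args


if __name__ == "__main__":
    argv = build_spark_submit_args(
        "explore_job.py",  # hypothetical application script
        conf={"spark.locality.wait": "3s"},  # example value, tune for your cluster
    )
    # Print a shell-safe command line that can be copied into a terminal.
    print(shlex.join(argv))
```

For interactive exploration, spark-shell accepts the same `--master` and `--conf` options, so the same settings can be tried in the shell first and then moved into a submitted job.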