I am trying to run a Spark application that reads data from Hive tables into DataFrames and joins them. When I run the DataFrame operations individually in the Spark shell, all the joins work fine and I am able to persist the data in ORC format in HDFS.
But when I run it as an application using spark-submit, I get the error below.
Missing an output location for shuffle 2
I researched this and found that it is related to a memory issue. What I do not understand is why the error does not occur in the Spark shell, where I can persist everything even with the same configuration.
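To check that both runs really use the same settings, a minimal sketch like the following could compare the effective configuration of the shell session and the submitted application. It assumes PySpark, where `spark.sparkContext.getConf().getAll()` returns a list of (key, value) pairs; `diff_conf` and the sample values are hypothetical, for illustration only, and are not part of my application.

```python
def diff_conf(shell_conf, submit_conf):
    """Return {key: (shell_value, submit_value)} for settings that differ.

    Both arguments are lists of (key, value) pairs, the shape returned by
    spark.sparkContext.getConf().getAll() in PySpark. A setting missing on
    one side is reported with None for that side.
    """
    shell, submit = dict(shell_conf), dict(submit_conf)
    keys = sorted(set(shell) | set(submit))
    return {
        k: (shell.get(k), submit.get(k))
        for k in keys
        if shell.get(k) != submit.get(k)
    }


# Hypothetical values for illustration. In practice, collect each list by
# calling spark.sparkContext.getConf().getAll() in the shell and in the app.
shell_conf = [
    ("spark.executor.memory", "4g"),
    ("spark.sql.shuffle.partitions", "200"),
]
submit_conf = [
    ("spark.executor.memory", "1g"),
    ("spark.sql.shuffle.partitions", "200"),
]

# Print only the settings whose values differ between the two runs.
for key, (in_shell, in_submit) in diff_conf(shell_conf, submit_conf).items():
    print(f"{key}: shell={in_shell} submit={in_submit}")
```

Any key this reports, especially memory or shuffle settings, would be a candidate explanation for the different behavior.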
The command I am using to run the application is below.