I am trying to submit a Spark2 app via the YARN REST API, following this tutorial from Hortonworks:
https://community.hortonworks.com/articles/28070/starting-spark-jobs-directly-via-yarn-rest-api.html. Here is the general flow: obtain a new app ID, then POST a new app with that ID and several parameters as a JSON object. However, the tutorial is written for Spark1.6, and the JSON parameter format has changed since Spark2. More specifically, for Spark1.6, the Spark assembly JAR, the app JAR, and the spark-yarn properties file are provided as "local-resources" and cached files, according to the tutorial. For Spark2, there is no local-resource or cached file, and the Spark assembly JAR and properties are provided as "resources" instead. See the two attached logs for more details.
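The two-step flow can be sketched roughly as follows in Python. This is only a sketch of the generic YARN submission format, not a confirmed working Spark2 payload: the ResourceManager address, the application name and type, and the helper names (`new_application`, `build_submission`, `submit_application`) are all hypothetical. The field names follow the YARN ResourceManager REST API as I understand it, where the new-application call is itself an HTTP POST.

```python
import json
import urllib.request

# Hypothetical ResourceManager address; replace with the real host and port.
RM_URL = "http://resourcemanager.example.com:8088"


def new_application(rm_url=RM_URL):
    """Step 1: ask the ResourceManager for a fresh application ID."""
    req = urllib.request.Request(
        rm_url + "/ws/v1/cluster/apps/new-application", data=b"", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["application-id"]


def build_submission(app_id, command, resources, env):
    """Build the JSON body for step 2.

    resources maps a link name to a dict with the HDFS uri, size, and
    modification timestamp of each file the container should localize.
    """
    local_entries = [
        {
            "key": name,
            "value": {
                "resource": res["uri"],
                "type": "FILE",
                "visibility": "APPLICATION",
                "size": res["size"],
                "timestamp": res["timestamp"],
            },
        }
        for name, res in resources.items()
    ]
    env_entries = [{"key": k, "value": v} for k, v in env.items()]
    return {
        "application-id": app_id,
        "application-name": "spark-rest-test",  # placeholder name
        "application-type": "SPARK",  # placeholder type string
        "am-container-spec": {
            "local-resources": {"entry": local_entries},
            "commands": {"command": command},
            "environment": {"entry": env_entries},
        },
        "max-app-attempts": 1,
        "resource": {"memory": 1024, "vCores": 1},
        "unmanaged-AM": False,
        "keep-containers-across-application-attempts": False,
    }


def submit_application(payload, rm_url=RM_URL):
    """Step 2: POST the JSON body and return the HTTP status code."""
    req = urllib.request.Request(
        rm_url + "/ws/v1/cluster/apps",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

A caller would chain the three helpers: pass the ID from `new_application()` to `build_submission(...)` along with the launch command, the files to localize, and the environment, then hand the result to `submit_application(...)`. Which files and environment entries a Spark2 ApplicationMaster actually needs is exactly the open question here, so those arguments are left to the caller.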
As a result, the Spark assembly JAR is not visible to the YARN container. Thus, although I was able to submit an app, it always finishes as FAILED. More specifically, the container log (see attachment) reports "Could not find or load main class org.apache.spark.executor.CoarseGrainedExecutorBackend", where "CoarseGrainedExecutorBackend" is the class named in the command used to submit the app. So, what is the correct JSON parameter format for submitting a Spark2 app via the YARN REST API?

Thanks,
Kun

Attachments: manual-spark16-oklog.txt, manual-spark21-oklog.txt, rest-container-faillog.txt, rest-spark2-faillog.txt