05-26-2016 06:57 PM
A quick workaround for the problem: in workflow.xml, add

```
<spark-opts>--conf spark.hadoop.yarn.resourcemanager.address=your-rm:8050</spark-opts>
```

The YARN client will then connect to the correct ResourceManager. If your action already has a `<spark-opts>` element, just append the `--conf` flag above to its existing contents. More documentation on `<spark-opts>`: https://oozie.apache.org/docs/4.2.0/DG_SparkActionExtension.html

The problem occurs when the YARN client tries to connect to the RM to fetch cluster metrics: https://github.com/apache/spark/blob/f47dbf27fa034629fab12d0f3c89ab75edb03f86/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L154 — it fails to pick up the configuration for the RM address.

Once you fix this, you will find that the RM address is not the only setting Spark's YARN client fails to pick up, so your misery won't end there. A proper fix, which I haven't found yet, should probably tell Oozie/Spark to take the YARN configuration from the Hadoop configuration already present on the cluster. If anyone can point out a Spark option that does that, please let me know.

@hortonworks: Please include a preview mode for answers so that we can check how the formatting looks.
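For context, here is a minimal sketch of where that `<spark-opts>` line fits inside an Oozie Spark action. Everything except the `--conf` flag is an assumption for illustration: the workflow name, class, jar path, and the `your-rm` hostname are placeholders you would replace with your own values.

```xml
<!-- Hypothetical minimal workflow.xml; placeholders: your-rm, com.example.MySparkApp, my-app.jar -->
<workflow-app name="spark-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="spark-node"/>
  <action name="spark-node">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <master>yarn-cluster</master>
      <name>MySparkApp</name>
      <class>com.example.MySparkApp</class>
      <jar>${nameNode}/apps/my-app.jar</jar>
      <!-- The workaround: point the launcher's YARN client at the correct RM -->
      <spark-opts>--conf spark.hadoop.yarn.resourcemanager.address=your-rm:8050</spark-opts>
    </spark>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Spark action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Note the `spark.hadoop.` prefix: Spark strips it and passes the remainder (`yarn.resourcemanager.address`) into the Hadoop `Configuration` its YARN client uses, which is why this reaches the RM lookup.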