
Spark application fails on slaves when launched from Oozie on YARN

Hi,

I am trying to launch a Spark application that works perfectly well from the shell, but the executors fail when it is launched from Oozie. On the slave (executor) side, I see the following:

Error: Could not find or load main class org.apache.spark.executor.CoarseGrainedExecutorBackend

On the driver side I see the following, but there is no null pointer in my code; the same code works fine when I launch Spark directly from the shell. It has something to do with the executors.

[Driver] ERROR logminer.main.LogMinerMain - null
java.lang.InterruptedException
    at java.lang.Object.wait(Native Method) ~[?:1.8.0_66]
    at java.lang.Object.wait(Object.java:502) ~[?:1.8.0_66]
    at org.apache.spark.scheduler.JobWaiter.awaitResult(JobWaiter.scala:73) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:513) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1466) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1484) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1498) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1512) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.rdd.RDD.collect(RDD.scala:813) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.api.java.JavaRDDLike$class.collect(JavaRDDLike.scala:320) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at org.apache.spark.api.java.AbstractJavaRDDLike.collect(JavaRDDLike.scala:46) ~[spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]
    at logminer.main.LogSparkTester.test(LogSparkTester.java:214) ~[__app__.jar:?]
    at logminer.main.LogMinerMain.testTrainOnHdfs(LogMinerMain.java:232) ~[__app__.jar:?]
    at com.telus.argus.logminer.main.LogMinerMain.main(LogMinerMain.java:159) [__app__.jar:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_66]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_66]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_66]
    at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_66]
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:484) [spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:?]

I am not sure how to solve this issue. I have put all the Spark-related jars in the lib folder for this Oozie job. Here is the directory structure on HDFS for this Oozie job:

oozie/
oozie/workflow.xml
oozie/job.properties
oozie/lib/argus-logminer-1.0.jar
oozie/lib/core-site.xml
oozie/lib/hdfs-site.xml
oozie/lib/kms-site.xml
oozie/lib/mapred-site.xml
oozie/lib/oozie-sharelib-spark-4.2.0.2.3.0.0-2557.jar
oozie/lib/spark-1.3.1.2.3.0.0-2557-yarn-shuffle.jar
oozie/lib/spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
oozie/lib/yarn-site.xml

Does anyone know how to solve this? Any idea which jar has the org.apache.spark.executor.CoarseGrainedExecutorBackend class?
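
One generic way to check which jar provides a class is to ask the JVM directly. This is a hypothetical helper, not from this thread; the classpath in the comment is an assumption:

// Hypothetical helper: prints the jar a class was loaded from.
// Example invocation (classpath is an assumption):
//   java -cp spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar:. WhichJar
public class WhichJar {
    public static void main(String[] args) throws ClassNotFoundException {
        // Load without initializing, so no Spark static setup code runs.
        Class<?> c = Class.forName(
                "org.apache.spark.executor.CoarseGrainedExecutorBackend",
                false, WhichJar.class.getClassLoader());
        // For a class loaded from a jar this prints the jar's URL;
        // getCodeSource() can be null for bootstrap classes.
        System.out.println(c.getProtectionDomain().getCodeSource().getLocation());
    }
}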


Contributor

What versions of Spark and HDP are you using? Can you list all the jars under the SPARK_HOME directory on a worker machine in the cluster?

Spark: 1.3.1 HDP: 2.3.0.0-2557

I don't see any SPARK_HOME variable in my shell, but here is the list of jars from /usr/hdp/current/spark-client/lib:

datanucleus-api-jdo-3.2.6.jar
datanucleus-rdbms-3.2.9.jar
spark-assembly-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar
datanucleus-core-3.2.10.jar
spark-1.3.1.2.3.0.0-2557-yarn-shuffle.jar
spark-examples-1.3.1.2.3.0.0-2557-hadoop2.7.1.2.3.0.0-2557.jar

Setting SPARK_HOME in hadoop-env.sh solved the issue.

For others who have the same issue: just add the following line to hadoop-env.sh under /usr/hdp/your_version_number/hadoop/conf:

export SPARK_HOME=/usr/hdp/current/spark-client
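
As a quick sanity check (a hypothetical snippet, not from this thread), a trivial main run on the cluster can confirm whether the environment your code runs in actually sees the variable:

// Hypothetical sanity check: prints SPARK_HOME as seen by the JVM's
// environment; System.getenv returns null if the variable is not set.
public class EnvCheck {
    public static void main(String[] args) {
        System.out.println("SPARK_HOME = " + System.getenv("SPARK_HOME"));
    }
}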

Contributor

Can you tell me which mode you are using: yarn-cluster or yarn-client?

Also, can you share the workflow.xml you are using?

yarn-cluster

<workflow-app name="${wf_name}" xmlns="uri:oozie:workflow:0.4">
  <start to="spark"/>
  <action name="spark">
    <spark xmlns="uri:oozie:spark-action:0.1">
      <job-tracker>${job_tracker}</job-tracker>
      <name-node>${name_node}</name-node>
      <master>${master}</master>
      <mode>cluster</mode>
      <name>logminer</name>
      <class>logminer.main.LogMinerMain</class>
      <jar>${filesystem}/${baseLoc}/oozie/lib/argus-logminer-1.0.jar</jar>
      <spark-opts>--driver-memory 4G --executor-memory 4G --num-executors 3 --executor-cores 5</spark-opts>
      <arg>-logtype</arg> <arg>adraw</arg>
      <arg>-inputfile</arg> <arg>/user/inputfile-march-3.txt</arg>
      <arg>-configfile</arg> <arg>${filesystem}/${baseLoc}/oozie/logminer.properties</arg>
      <arg>-mode</arg> <arg>test</arg>
    </spark>
    <ok to="success_email"/>
    <error to="fail_email"/>
  </action>
  <action name="success_email">
    <email xmlns="uri:oozie:email-action:0.1">
      <to>${emailTo}</to>
      <cc>${emailCC}</cc>
      <subject>${wf_name}: Successful run at ${wf:id()}</subject>
      <body>The workflow [${wf:id()}] ran successfully.</body>
    </email>
    <ok to="end"/>
    <error to="fail_email"/>
  </action>
  <action name="fail_email">
    <email xmlns="uri:oozie:email-action:0.1">
      <to>${emailTo}</to>
      <cc>${emailCC}</cc>
      <subject>${wf_name}: Failed at ${wf:id()}</subject>
      <body>The workflow [${wf:id()}] failed at [${wf:lastErrorNode()}] with the following message: ${wf:errorMessage(wf:lastErrorNode())}</body>
    </email>
    <ok to="fail"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>

Contributor

Thanks @Shary M for providing the workflow. It looks like the arguments might not be reaching the Java application the way you expect. Each <arg> element in the Oozie workflow is passed to the application positionally, as args[0] .. args[n], where args[0] is the first argument given in the workflow (as sketched below). In the workflow above:

  • args[0] --> -logtype and
  • args[1] --> adraw
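
For illustration, here is a minimal sketch of how those <arg> values arrive in the application's main and can be parsed with commons-cli (the class name ArgEchoSketch and the option descriptions are hypothetical; the flags are the ones from the workflow above):

import org.apache.commons.cli.BasicParser;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class ArgEchoSketch {
    public static void main(String[] args) throws ParseException {
        // Each <arg> element of the Oozie spark action arrives as one entry
        // of args[], in document order:
        // args[0] = "-logtype", args[1] = "adraw", args[2] = "-inputfile", ...
        for (int i = 0; i < args.length; i++) {
            System.out.println("args[" + i + "] = " + args[i]);
        }

        // Parsing with commons-cli's BasicParser, as the application
        // reportedly does; option names mirror the workflow's flags.
        Options options = new Options();
        options.addOption("logtype", true, "type of log to mine");
        options.addOption("inputfile", true, "HDFS path of the input file");
        options.addOption("configfile", true, "application properties file");
        options.addOption("mode", true, "run mode, e.g. test");

        CommandLineParser parser = new BasicParser();
        CommandLine cmd = parser.parse(options, args);
        System.out.println("logtype = " + cmd.getOptionValue("logtype"));
    }
}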

You can also refer to the Spark action examples in the Oozie documentation.

Please let us know if you need more information. If it fails again, please also share a snippet of your application.

No, the arguments are passed correctly. That is how my application accepts them, since I am using org.apache.commons.cli.BasicParser. I verified this multiple times by printing the arguments inside the application; there is nothing wrong there. Thanks for your help.