Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

New Contributor

I'm trying to run the SparkPi example using the example jar in Spark2 and running it through Oozie. Attached are the different configuration files for Oozie:

job-properties.txt

workflow.xml

I have the following directory structure on both the local FS and HDFS:

~/sparkAction/
+- job.properties
+- workflow.xml
+- lib/
   +- spark-examples_2.11-2.0.0.2.5.3.0-37.jar
   +- spark-hdp-assembly.jar

When I run it as the yarn user with the following command:

oozie job -oozie http://kvs-in-merlin04.int.kronos.com:11000/oozie -config job.properties -run

I get the following error:

java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession$
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:559)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 7 more

The Oozie launcher successfully starts SparkPi on YARN, which suggests there are no permission issues. However, the Spark program cannot find the SparkSession class.

Please help...
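
A quick way to narrow down a NoClassDefFoundError like the one above is to check which jars in the workflow's lib/ directory actually contain the missing class. SparkSession was introduced in Spark 2.0, so a Spark 1.x assembly on the classpath would not provide it. The following is a minimal sketch, not part of the original post; the function name `jars_containing` is hypothetical, and it assumes the jars have been copied to a local directory.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, lib_dir):
    """Return the sorted names of jars in lib_dir that contain class_name.

    class_name is a fully qualified JVM class name, for example
    'org.apache.spark.sql.SparkSession$'.
    """
    # A class a.b.C$ is stored inside a jar as the entry a/b/C$.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives
            continue
    return hits
```

Calling `jars_containing("org.apache.spark.sql.SparkSession$", "lib")` returns an empty list if none of the jars shipped with the workflow provides the class, which would be consistent with the error in the trace.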

1 ACCEPTED SOLUTION

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

Mentor

I am not sure whether Spark2 is supported via Oozie yet, but assuming it is, did you add the Spark2 libraries to the Oozie sharelib? That would be the first step.

7 REPLIES

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

New Contributor

Hi Artem,

Yes, I created a new directory on HDFS and added it to the Oozie libpath as follows:

oozie.libpath=/user/oozie/share/lib/spark2

I copied all the jars from the Spark2 installation directory /usr/hdp/2.5.3.0-37/spark2/jars into the above HDFS directory, but I still get this error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1484116726997_0144 finished with failed status
org.apache.spark.SparkException: Application application_1484116726997_0144 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:738)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:289)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:211)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:51)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:59)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:242)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Any ideas about what might be causing this error?
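
When experimenting with library paths like this, it can help to confirm which sharelib-related settings the job.properties file actually contains. Oozie's documentation describes `oozie.libpath`, `oozie.use.system.libpath`, and a per-action override named `oozie.action.sharelib.for.spark`; whether any particular combination resolves this error is not established in this thread. The sketch below is an illustration only: the function names are hypothetical, and the parser handles only simple `key=value` lines.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def sharelib_report(text):
    """Report the sharelib-related settings found in a job.properties text.

    A value of None means the property is not set in the file.
    """
    props = parse_properties(text)
    return {
        "libpath": props.get("oozie.libpath"),
        "use_system_libpath": props.get("oozie.use.system.libpath"),
        "spark_sharelib": props.get("oozie.action.sharelib.for.spark"),
    }
```

For example, `sharelib_report(open("job.properties").read())` shows at a glance whether the job relies only on `oozie.libpath` or also selects a specific sharelib for the Spark action.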

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

Expert Contributor

You can get more information from the YARN logs:

yarn logs -applicationId application_1484116726997_0144
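
YARN logs for a failed application can be long, so it may be easier to save the output to a file and pull out only the exception lines. The following is a rough sketch under that assumption; the function name `exception_lines` and the regular expression are illustrative, and they match only lines shaped like the Java stack traces quoted in this thread.

```python
import re

# Matches lines such as
#   java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession$
#   Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$
# Stack frames ("at ...") and "... 7 more" lines do not match.
EXCEPTION_LINE = re.compile(
    r"^(?:Caused by: )?([\w.$]+(?:Exception|Error))(?:: (.*))?$"
)


def exception_lines(log_text):
    """Return (exception_class, message) pairs in the order they appear."""
    results = []
    for line in log_text.splitlines():
        match = EXCEPTION_LINE.match(line.strip())
        if match:
            results.append((match.group(1), match.group(2) or ""))
    return results
```

After redirecting the output of the command above to a file, passing the file's contents to `exception_lines` lists the exception classes and messages, with the last "Caused by" entry usually being the most specific cause.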

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

Mentor

Judging from the comments on the following Jira, Spark2 support will arrive with Oozie 5.0: https://issues.apache.org/jira/plugins/servlet/mobile#issue/OOZIE-2767

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

Mentor

@Shikhar Agarwal Spark2 is not officially supported in HDP via Oozie, and it is not implemented in Apache Oozie either. Please consider accepting this answer to close the thread. Sorry this isn't much help.

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

New Contributor

Thanks Artem

Re: Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

New Contributor

Hi Artem, do you have a Hortonworks link stating that Spark2 is not officially supported in HDP via Oozie? I want to run Spark2 via Oozie on HDP 2.6, and it seems from this doc that Spark2 via Oozie (Oozie 4.2 in HDP 2.6) IS possible. Perhaps the poster didn't copy some libraries or jars to the spark2 sharelib? (Again, see link.)