Has anyone tried Spark2 jar execution in Yarn cluster mode through Oozie?

Rising Star

I'm trying to run the SparkPi example from the Spark2 examples jar through Oozie. The Oozie configuration files are attached:

job-properties.txt

workflow.xml

I have the following directory structure on both the local FS and HDFS:

+-~/sparkAction/
  +-job.properties
  +-workflow.xml
  +-lib/
    +-spark-examples_2.11-2.0.0.2.5.3.0-37.jar
    +-spark-hdp-assembly.jar

When I run it with the following command as the yarn user:

oozie job -oozie http://kvs-in-merlin04.int.kronos.com:11000/oozie -config job.properties -run

I get the following error:

java.lang.NoClassDefFoundError: org/apache/spark/sql/SparkSession$
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:28)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.yarn.ApplicationMaster$anon$2.run(ApplicationMaster.scala:559)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	... 7 more

The Oozie launcher successfully starts SparkPi on YARN, so there are no permission issues. However, the Spark program cannot find the SparkSession class.

Please help...
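
The missing class, org.apache.spark.sql.SparkSession$, normally ships in Spark 2's spark-sql jar rather than in the examples jar, so one quick check is whether any jar in the workflow's lib/ directory actually contains it. Below is a minimal sketch in Python, assuming the jars are available in a local directory. The helper name jars_containing is hypothetical and not part of Spark or Oozie.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name, jar_dir):
    """Return the sorted names of jars in jar_dir that contain class_name.

    class_name is a fully qualified JVM class name, for example
    "org.apache.spark.sql.SparkSession$".
    """
    # Inside a jar, a class is stored as a path with a .class suffix.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid jar (zip) archives.
            continue
    return hits
```

Calling jars_containing("org.apache.spark.sql.SparkSession$", "lib") against a local copy of lib/, and again against a copy of the Spark2 jars directory, would show which set of jars, if any, provides SparkSession.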

1 ACCEPTED SOLUTION

Master Mentor

I am not sure if Spark2 is supported via Oozie yet, but let's say it is: did you add the Spark2 libraries to the Oozie sharelib? That would be your first step.

7 REPLIES

Rising Star

Hi Artem,

Yes, I created a new directory in HDFS and included it in the Oozie libpath as below:

oozie.libpath=/user/oozie/share/lib/spark2

I copied all the jars from the Spark2 installation directory /usr/hdp/2.5.3.0-37/spark2/jars into the above HDFS directory, but it still fails with this error:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Application application_1484116726997_0144 finished with failed status
org.apache.spark.SparkException: Application application_1484116726997_0144 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:738)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:289)
	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:211)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:51)
	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:59)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:497)
	at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:242)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Any ideas about what might be causing this error?

Expert Contributor

You can get more information from the YARN logs:

yarn logs -applicationId application_1484116726997_0144
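
Aggregated YARN logs can be long, so it may help to pull out only the exception headlines and skip the individual stack frames. Here is a minimal sketch in Python, assuming the output of the command above has been saved to a text file. The function name exception_summary is hypothetical.

```python
import re

# An exception headline is an optional "Caused by: " prefix followed by a
# class name ending in Exception or Error. Stack frames start with "at ".
_HEADLINE = re.compile(r"^(Caused by: )?[\w.$]+(Exception|Error)\b.*$")


def exception_summary(log_text):
    """Return the exception headline lines found in log_text, in order."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if _HEADLINE.match(line.strip())
    ]
```

Applied to the first trace in this thread, this would keep the NoClassDefFoundError line and its "Caused by" ClassNotFoundException line, which name the class that is missing from the classpath.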

Master Mentor

Judging from the comments on the following Jira, it looks like Spark2 support will arrive with Oozie 5.0: https://issues.apache.org/jira/plugins/servlet/mobile#issue/OOZIE-2767

Master Mentor

@Shikhar Agarwal Spark2 is not officially supported in HDP via Oozie, and it is not implemented in Apache Oozie either. Please consider accepting this answer to close the thread. Sorry this isn't much help.

Rising Star

Thanks Artem

Explorer

Hi Artem, do you have a Hortonworks link stating that Spark2 is not officially supported in HDP via Oozie? I want to run Spark2 via Oozie on HDP 2.6, and it seems from this doc that Spark2 via Oozie (Oozie 4.2 in HDP 2.6) IS possible. Perhaps the poster didn't copy some libraries or jars to the spark2 sharelib? (Again, see link.)
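
For anyone trying the sharelib route, the rough sequence is: create a spark2 directory inside the active sharelib, copy the Spark2 jars into it, refresh the sharelib, then confirm Oozie lists it. Below is a minimal sketch in Python that only builds the command strings and does not run them. The function name sharelib_commands is hypothetical, and the sharelib path is an assumption: the active sharelib usually sits in a timestamped lib_... directory that differs per cluster, so it is passed in as an argument rather than guessed.

```python
def sharelib_commands(oozie_url, local_jars, sharelib_dir):
    """Build, but do not execute, the commands for publishing Spark2 jars.

    oozie_url    -- Oozie server URL, as passed to "oozie job -oozie"
    local_jars   -- local directory holding the Spark2 jars
    sharelib_dir -- target HDFS directory (assumption: a spark2 directory
                    inside the cluster's active sharelib)
    """
    return [
        # Create the target directory in HDFS.
        f"hdfs dfs -mkdir -p {sharelib_dir}",
        # Copy the Spark2 jars, overwriting any existing copies.
        f"hdfs dfs -put -f {local_jars}/* {sharelib_dir}",
        # Ask the Oozie server to reload its sharelib.
        f"oozie admin -oozie {oozie_url} -sharelibupdate",
        # List the spark2 sharelib to confirm the jars are visible.
        f"oozie admin -oozie {oozie_url} -shareliblist spark2",
    ]
```

Printing the result for the Oozie URL and the /usr/hdp/2.5.3.0-37/spark2/jars directory mentioned earlier in the thread gives a checklist to run by hand. The workflow would presumably also need oozie.action.sharelib.for.spark=spark2 in job.properties so that the Spark action picks up that sharelib. Whether this is sufficient on a given HDP release is not confirmed by anything in this thread.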