Spark2 classpath issues with Oozie


Hi there,

After installing Spark2 from CM parcels, I noticed that /etc/spark2/conf/spark-defaults.conf contains the following empty properties:

spark.hadoop.mapreduce.application.classpath=
spark.hadoop.yarn.application.classpath=

As far as I understand, these properties are generated when the Spark2 parcel is installed, by the SPARK2_ON_YARN-2.2.0.cloudera1.jar!/scripts/common.sh script:

# Override the YARN / MR classpath configs since we already include them when generating
# SPARK_DIST_CLASSPATH. This avoids having the same paths added to the classpath a second
# time and wasting file descriptors.
replace_spark_conf "spark.hadoop.mapreduce.application.classpath" "" "$SPARK_DEFAULTS"
replace_spark_conf "spark.hadoop.yarn.application.classpath" "" "$SPARK_DEFAULTS"

So, if Oozie is configured to run Spark2 jobs, the following exception occurs:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
	at java.lang.Class.getMethod0(Class.java:3018)
	at java.lang.Class.getMethod(Class.java:1784)
	at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

If spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath are removed from spark-defaults.conf, then everything works fine.
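
To make the workaround concrete, here is a minimal sketch of how the two empty overrides could be filtered out of a copy of the file. The helper name strip_empty_classpath_props and the temporary demo files are hypothetical, not part of CM or the parcel, and the demo deliberately runs against a throwaway file rather than the real /etc/spark2/conf/spark-defaults.conf. Only lines whose value is empty are dropped.

```shell
# Hypothetical helper (not part of CM or the Spark2 parcel): write a copy of a
# spark-defaults file without the two empty classpath overrides.
# The source file is left untouched.
strip_empty_classpath_props() {
  src="$1"
  dst="$2"
  # grep -v exits non-zero when nothing is printed, hence the "|| true".
  grep -v -E '^spark\.hadoop\.(mapreduce|yarn)\.application\.classpath=[[:space:]]*$' \
    "$src" > "$dst" || true
}

# Demo on a throwaway file (assumption: same key=value layout as spark-defaults.conf).
demo_dir="$(mktemp -d)"
printf '%s\n' \
  'spark.master=yarn' \
  'spark.hadoop.mapreduce.application.classpath=' \
  'spark.hadoop.yarn.application.classpath=' \
  > "$demo_dir/spark-defaults.conf"

strip_empty_classpath_props \
  "$demo_dir/spark-defaults.conf" \
  "$demo_dir/spark-defaults.stripped.conf"

cat "$demo_dir/spark-defaults.stripped.conf"
```

Note that CM may regenerate the client configuration on redeploy, so a manual edit like this is probably not a durable fix.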

 

At the same time, spark2-submit.sh works as expected because it reads /etc/spark2/conf/classpath.txt, whereas Oozie does not use that file.
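
For illustration, this is roughly how I assume a one-entry-per-line file such as classpath.txt gets turned into the SPARK_DIST_CLASSPATH value mentioned in the common.sh comment. It is only a sketch: the helper name join_classpath and the example jar paths are made up, and the real launcher scripts may do this differently.

```shell
# Hypothetical helper: join a file with one classpath entry per line into a
# single ':'-separated string.
join_classpath() {
  paste -sd: "$1"
}

# Demo with made-up entries instead of the real /etc/spark2/conf/classpath.txt.
demo_file="$(mktemp)"
printf '%s\n' '/opt/example/lib/a.jar' '/opt/example/lib/b.jar' > "$demo_file"

SPARK_DIST_CLASSPATH="$(join_classpath "$demo_file")"
echo "$SPARK_DIST_CLASSPATH"
```

A launcher that never builds this variable has no other source for the Hadoop jars once the two classpath properties are blanked, which would match the NoClassDefFoundError above.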

 

So, could the community or the CM developers shed some light on whether the empty spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath properties are really necessary?
