New Contributor
Posts: 1
Registered: ‎09-11-2017

Spark2 classpath issues with Oozie

Hi there,

After installing Spark2 via CM parcels, I noticed that /etc/spark2/conf/spark-defaults.conf contains the following empty properties:

spark.hadoop.mapreduce.application.classpath=
spark.hadoop.yarn.application.classpath=

As far as I understand, these properties are generated when the Spark2 parcel is installed, by the scripts/common.sh script inside SPARK2_ON_YARN-2.2.0.cloudera1.jar:

# Override the YARN / MR classpath configs since we already include them when generating
# SPARK_DIST_CLASSPATH. This avoids having the same paths added to the classpath a second
# time and wasting file descriptors.
replace_spark_conf "spark.hadoop.mapreduce.application.classpath" "" "$SPARK_DEFAULTS"
replace_spark_conf "spark.hadoop.yarn.application.classpath" "" "$SPARK_DEFAULTS"
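
For readers who have not seen the script, here is a rough sketch of the effect these two calls have on the file. Note that `set_conf` is a hypothetical stand-in written for illustration, not the actual replace_spark_conf from common.sh (whose implementation may differ), and the demo works on a throwaway temp file with made-up sample content rather than the real spark-defaults.conf.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for replace_spark_conf; the real helper in common.sh may differ.
set_conf() {
  local key="$1" value="$2" file="$3" tmp
  tmp="$(mktemp)"
  # Keep every line whose key differs, then append the new key=value pair.
  awk -F= -v k="$key" '$1 != k' "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Demo on a throwaway file (assumed sample content, not a real cluster config).
DEMO_DEFAULTS="$(mktemp)"
printf '%s\n' \
  'spark.master=yarn' \
  'spark.hadoop.yarn.application.classpath=/some/path/*' > "$DEMO_DEFAULTS"

set_conf "spark.hadoop.mapreduce.application.classpath" "" "$DEMO_DEFAULTS"
set_conf "spark.hadoop.yarn.application.classpath" "" "$DEMO_DEFAULTS"
cat "$DEMO_DEFAULTS"
```

Whatever value the YARN property had before, it ends up present but empty, which matches what I see in spark-defaults.conf.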

So, if Oozie is configured to run Spark2 jobs, the following exception is thrown:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
	at java.lang.Class.getMethod0(Class.java:3018)
	at java.lang.Class.getMethod(Class.java:1784)
	at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

If spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath are removed from spark-defaults.conf, everything works fine.
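
The removal can be scripted. This is a minimal sketch that filters a copy of the file; the temp files and sample content below are made up for the demo and stand in for /etc/spark2/conf/spark-defaults.conf. On a CM-managed host the real file may be regenerated when client configuration is redeployed, so treat this as a workaround rather than a fix.

```shell
#!/usr/bin/env bash
# Demo input standing in for /etc/spark2/conf/spark-defaults.conf (assumed sample content).
SRC_CONF="$(mktemp)"
printf '%s\n' \
  'spark.master=yarn' \
  'spark.hadoop.mapreduce.application.classpath=' \
  'spark.hadoop.yarn.application.classpath=' \
  'spark.eventLog.enabled=true' > "$SRC_CONF"

# Write a filtered copy without the two classpath overrides; the source is left untouched.
FIXED_CONF="$(mktemp)"
grep -v -E '^spark\.hadoop\.(mapreduce|yarn)\.application\.classpath=' "$SRC_CONF" > "$FIXED_CONF"

cat "$FIXED_CONF"
```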

At the same time, spark2-submit.sh works as expected because it reads /etc/spark2/conf/classpath.txt, whereas Oozie does not use that file.
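
For context, here is a sketch of how a launcher can turn classpath.txt into a classpath. It assumes the file holds one entry per line and that the entries are joined with ':' into SPARK_DIST_CLASSPATH (the variable named in the common.sh comment). The sample entries and the temp file are made up, and the exact logic in the CM-generated Spark environment script may differ.

```shell
#!/usr/bin/env bash
# Stand-in for /etc/spark2/conf/classpath.txt (assumed format: one entry per line).
CLASSPATH_TXT="$(mktemp)"
printf '%s\n' \
  '/opt/example/hadoop/lib/a.jar' \
  '/opt/example/hadoop/lib/b.jar' > "$CLASSPATH_TXT"

# Join the lines with ':' to form a Java-style classpath.
SPARK_DIST_CLASSPATH="$(paste -sd: "$CLASSPATH_TXT")"
export SPARK_DIST_CLASSPATH

echo "$SPARK_DIST_CLASSPATH"
```

A launcher that skips this step, as Oozie appears to here, would have no Hadoop jars on the classpath once the two properties are empty, which would be consistent with the NoClassDefFoundError above.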

So, could the community or CM developers shed some light on whether the spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath properties are really necessary?