Spark2 classpath issues with Oozie

New Contributor

Hi there,

 

After installing Spark2 via CM parcels, I noticed that /etc/spark2/conf/spark-defaults.conf contains the following empty properties:

 

spark.hadoop.mapreduce.application.classpath=
spark.hadoop.yarn.application.classpath=
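
For reference, when these two properties are not overridden they normally point at the Hadoop jars. The stock upstream Hadoop defaults look roughly like this (illustrative only; CM generates its own parcel-based values for CDH):

mapreduce.application.classpath=$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
yarn.application.classpath=$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*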

As far as I understand, these empty properties are generated at Spark2 parcel installation time by the SPARK2_ON_YARN-2.2.0.cloudera1.jar!/scripts/common.sh script:

 

# Override the YARN / MR classpath configs since we already include them when generating
# SPARK_DIST_CLASSPATH. This avoids having the same paths added to the classpath a second
# time and wasting file descriptors.
replace_spark_conf "spark.hadoop.mapreduce.application.classpath" "" "$SPARK_DEFAULTS"
replace_spark_conf "spark.hadoop.yarn.application.classpath" "" "$SPARK_DEFAULTS"

So, when Oozie is configured to run Spark2 jobs, the following exception occurs:

 

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
	at java.lang.Class.getDeclaredMethods0(Native Method)
	at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
	at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
	at java.lang.Class.getMethod0(Class.java:3018)
	at java.lang.Class.getMethod(Class.java:1784)
	at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
	at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
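
For context, the Oozie setup I mean is the usual spark2 sharelib approach; here is a rough sketch (the sharelib timestamp, host name and paths below are illustrative, not an exact record of my commands):

# 1. Copy the Spark2 jars from the parcel into a spark2 sharelib directory in HDFS
hdfs dfs -mkdir -p /user/oozie/share/lib/lib_20171128000000/spark2
hdfs dfs -put /opt/cloudera/parcels/SPARK2/lib/spark2/jars/* /user/oozie/share/lib/lib_20171128000000/spark2/
# Reuse the oozie-sharelib-spark jar from the existing spark sharelib
hdfs dfs -cp /user/oozie/share/lib/lib_20171128000000/spark/oozie-sharelib-spark*.jar /user/oozie/share/lib/lib_20171128000000/spark2/

# 2. Refresh the sharelib so Oozie picks up the new directory
oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate

# 3. Point the workflow's Spark action at the spark2 sharelib (job.properties)
oozie.use.system.libpath=true
oozie.action.sharelib.for.spark=spark2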

If spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath are removed from spark-defaults.conf, then everything works fine.
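
(Removing them just means deleting the two empty lines on the gateway host, roughly like this; the parcel scripts / CM will likely regenerate them on the next client configuration deployment:)

sudo sed -i \
  -e '/^spark\.hadoop\.mapreduce\.application\.classpath=/d' \
  -e '/^spark\.hadoop\.yarn\.application\.classpath=/d' \
  /etc/spark2/conf/spark-defaults.conf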

 

At the same time, spark2-submit.sh works as expected because it reads /etc/spark2/conf/classpath.txt, but Oozie does not use that file.
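
(Rough illustration of the difference, not the exact parcel code: as far as I can tell, the spark2 client scripts build SPARK_DIST_CLASSPATH from classpath.txt, which is where the Hadoop classes come from, i.e. something equivalent to

export SPARK_DIST_CLASSPATH=$(paste -s -d ':' /etc/spark2/conf/classpath.txt)

while the Oozie launcher only gets whatever the sharelib and spark-defaults.conf provide.)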

 

So, could the community or the CM developers shed some light on whether the spark.hadoop.mapreduce.application.classpath and spark.hadoop.yarn.application.classpath properties are really necessary?