
Oozie workflow not finding main class

Explorer

Hello Community,

 

I am going through this tutorial https://www.cloudera.com/tutorials/setting-up-a-spark-development-environment-with-java.html and I am trying to submit the application through an Oozie workflow. In Hue I go to Query > Scheduler > Workflow, drop a Java program into the actions, upload the jar file, and add the main class, which is Hortonworks.SparkTutorial.Main. When I run the Oozie workflow job I keep getting the error: "Caused by: java.lang.ClassNotFoundException: Class Hortonwork.SparkTutorial.Main not found". I'm using IntelliJ for this project, so I hold down Ctrl and hover over Main; it takes me to the MANIFEST.MF file, which says Main-Class: Hortonworks.SparkTutorial.Main, so I feel like I have the main class definition right. I cannot figure out why it says it can't find my class.
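For reference, the workflow.xml that Hue generates for that Java action should look roughly like the sketch below (node names and the file reference are illustrative, not my exact setup); the value in main-class has to match the fully qualified class name exactly:

<action name="java-action">
    <java>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- must match package + class name exactly -->
        <main-class>Hortonworks.SparkTutorial.Main</main-class>
        <!-- the uploaded jar is attached to the action as a file (path illustrative) -->
        <file>SparkTutorial.jar#SparkTutorial.jar</file>
    </java>
    <ok to="End"/>
    <error to="Kill"/>
</action>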

 

 


Expert Contributor

Hi @jarededrake, the "ClassNotFoundException: Class Hortonwork.SparkTutorial.Main not found" suggests that the main class package name in your workflow definition might have a typo: "Hortonwork" should be "Hortonworks". Can you check that?

Explorer

@mszurap You were right, I didn't have the "s" at the end 😐. However, I am still getting a problem in Oozie. I've added a screenshot so you can see the structure of my program, and you can see in the MANIFEST.MF file that it has my main class as

Hortonworks.SparkTutorial.Main
Capture9.PNG

Expert Contributor

I see. Have you verified that the built jar actually contains this package structure and class name? Can you also show where the jar is uploaded and how it is referenced in the Oozie workflow?

Thanks, Miklos

Explorer

Yes, I can. I will get those screenshots to you.

Explorer

So this is interesting. I run "mvn package" to package my application into a jar, and I get two different jars. The first is SparkTutorial.jar; when I look at the contents of that jar file I only see my dependencies, but I do not see my main class.

 

I run: jar tf C:\Users\drakej\Desktop\SparkTutorial\out\artifacts\SparkTutorial_jar\SparkTutorial.jar

 

A sample of the output is below; it only lists my dependencies.

 

org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyRequestProto$Builder.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyRequestProto.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyRequestProtoOrBuilder.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyResponseProto$1.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyResponseProto$Builder.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyResponseProto.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos$TransitionToStandbyResponseProtoOrBuilder.class
org/apache/hadoop/ha/proto/HAServiceProtocolProtos.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$1.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveRequestProto$1.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveRequestProto$Builder.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveRequestProto.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveRequestProtoOrBuilder.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveResponseProto$1.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveResponseProto$Builder.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveResponseProto.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$CedeActiveResponseProtoOrBuilder.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$GracefulFailoverRequestProto$1.class
org/apache/hadoop/ha/proto/ZKFCProtocolProtos$GracefulFailoverRequestProto$Builder.class

 

The second jar file that gets built is SparkTutorial-1.0-SNAPSHOT.jar. I run

 

jar tf C:\Users\drakej\Desktop\SparkTutorial\target\SparkTutorial-1.0-SNAPSHOT.jar

That one does have my class listed. However, when I run this jar file in Oozie I get the error:

 

org.apache.oozie.action.hadoop.JavaMainException: java.lang.NoClassDefFoundError: org/apache/spark/SparkConf

 So one jar has my dependencies and another jar has my class but they are not together. 
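From what I have read, if I wanted a single jar that carries both my classes and the dependencies, the usual Maven approach is the maven-shade-plugin; a rough sketch of what would go in the pom.xml (the plugin version is just an example):

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- version is illustrative -->
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- sets Main-Class in the shaded jar's manifest -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                <mainClass>Hortonworks.SparkTutorial.Main</mainClass>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>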

Explorer

Forgot to add the contents of 

 

jar tf C:\Users\drakej\Desktop\SparkTutorial\target\SparkTutorial-1.0-SNAPSHOT.jar

 

META-INF/
META-INF/MANIFEST.MF
Hortonworks/
Hortonworks/SparkTutorial/
code.txt
Hortonworks/SparkTutorial/Main.class
Main.class
replacementValues.properties
shakespeareText.txt
ULAN-Test-IPSummary.csv
ULAN-Test-IPSummary.txt
META-INF/maven/
META-INF/maven/hortonworks/
META-INF/maven/hortonworks/SparkTutorial/
META-INF/maven/hortonworks/SparkTutorial/pom.xml
META-INF/maven/hortonworks/SparkTutorial/pom.properties

Expert Contributor

Hi @jarededrake, sorry for the delay, I was away for a couple of days.

You should use your thin jar (application only, without the dependencies) from the target directory ("SparkTutorial-1.0-SNAPSHOT.jar"). The NoClassDefFoundError for SparkConf suggests that you tried a Java action. It is highly recommended to use a Spark action in the Oozie workflow editor when running a Spark application, so that the environment is set up properly for the application.
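For reference, a Spark action in the generated workflow.xml looks roughly like this sketch (the HDFS path and names are illustrative, adjust them to your setup):

<action name="spark-action">
    <spark xmlns="uri:oozie:spark-action:0.1">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <master>yarn-cluster</master>
        <name>SparkTutorial</name>
        <!-- fully qualified main class -->
        <class>Hortonworks.SparkTutorial.Main</class>
        <!-- HDFS path to the thin application jar (illustrative) -->
        <jar>${nameNode}/user/zzmdrakej2/SparkTutorial-1.0-SNAPSHOT.jar</jar>
    </spark>
    <ok to="End"/>
    <error to="Kill"/>
</action>

With a Spark action, Oozie's Spark sharelib puts the Spark and Hadoop jars on the classpath for you, which is why the thin application jar is enough there, while a plain Java action cannot find classes like SparkConf.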

Explorer

Hey @mszurap, no problem at all, totally understand. So I ran my "SparkTutorial-1.0-SNAPSHOT.jar" from Hue > Query > Scheduler > Workflow. I drag Spark down to the workflow and fill out the inputs like this:

Jar/py name: SparkTutorial-1.0-SNAPSHOT.jar

Main class: Main 

Files + : /user/zzmdrakej2/SparkTutorial-1.0-SNAPSHOT.jar

 

I then get this error (it's different, so I guess that's a good sign, right?):

 

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Delegation Token can be issued only with kerberos or web authentication

 

 

Expert Contributor

Hi @jarededrake, you're on the right track. The issue now seems to be that the cluster has Kerberos enabled, and that needs some extra configuration.

In the workflow editor, in the upper right corner of the Spark action you will find a cogwheel icon for advanced settings. There, on the Credentials tab, enable the "hcat" and "hbase" credentials to let the Spark client obtain delegation tokens for the Hive (Hive metastore) and HBase services, in case the Spark application wants to use those services (Spark does not know this in advance, so it obtains those delegation tokens). You can also disable this behavior if you are sure that the Spark application will not connect to Hive (using Spark SQL) or HBase; just add the following to the Spark action's options list:

--conf spark.security.credentials.hadoopfs.enabled=false --conf spark.security.credentials.hbase.enabled=false --conf spark.security.credentials.hive.enabled=false

but it's easier to just enable these credentials in the settings page.
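In the generated workflow.xml those options simply end up in the action's spark-opts element, roughly like this sketch:

<spark-opts>
    --conf spark.security.credentials.hadoopfs.enabled=false
    --conf spark.security.credentials.hbase.enabled=false
    --conf spark.security.credentials.hive.enabled=false
</spark-opts>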

For similar Kerberos-related issues in other actions, please see the following guide:

https://gethue.com/hadoop-tutorial-oozie-workflow-credentials-with-a-hive-action-with-kerberos/ 

Moderator

@jarededrake Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks!


Regards,

Diana Torres,
Community Moderator

