how to get the spark support with 4.0.0-cdh5.3.2 oozie


Contributor

Hi,

    As you know, Oozie 4.0.0-cdh5.3.2, the version shipped with CDH 5.3.2, does not support Spark.

    However, we would still like to run Spark jobs as part of an Oozie workflow.

    How can we do this in our CDH 5.3.2 environment?

   

Thanks

Paul

1 ACCEPTED SOLUTION


Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Super Collaborator

To rule out a problem with your custom jar, can you run the Pi example to check whether the cluster is set up correctly?

Our documentation describes how to run a Spark application and includes that example.

The error you show points to a classpath problem: classes that the job needs cannot be found on your classpath.

Wilfred
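
A minimal sketch of that check, assuming the Spark gateway puts spark-submit on the PATH. The EXAMPLES_JAR location is a guess at a parcel layout, not something stated in this thread, so locate the real examples jar on your gateway host first. DRY_RUN is a hypothetical switch added for illustration: with the default of 1 the command is only printed so it can be reviewed before running.

```shell
#!/bin/sh
# Sketch: run the bundled SparkPi example on YARN to check the cluster setup.
# EXAMPLES_JAR is an ASSUMED path; adjust it to where your install keeps the jar.
EXAMPLES_JAR="${EXAMPLES_JAR:-/opt/cloudera/parcels/CDH/lib/spark/lib/spark-examples.jar}"
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 actually submits the job.
DRY_RUN="${DRY_RUN:-1}"

run_pi() {
  # Build the command once so the printed and executed forms cannot drift apart.
  set -- spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master yarn \
    "$EXAMPLES_JAR" 10
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

run_pi
```

If the Pi example fails in the same way, the problem is more likely in the cluster or gateway configuration than in the custom jar.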

6 REPLIES 6

Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Super Collaborator

The only way to use Spark when you do not have a Spark action is to use the shell action and build the proper spark-submit command for it.

You will need to make sure that the configuration, classpath, and so on are set up from within the action.

 

Wilfred

Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Contributor
Hi Wilfred, I will try to follow your suggestion. Thanks, Paul

Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Contributor
Hi Wilfred,
Could you give me an example?
Thanks in advance.
Paul

Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Super Collaborator

Whatever spark-submit command you run from the command line is what you use in the Oozie shell action.

Make sure that the Spark and YARN gateways are installed on the Oozie server so that it has the configuration it needs.

The rest works as for a standard Oozie shell action: create the workflow, properties, and shell script files, and place them on the local machine or in HDFS where they can be found.

 

Wilfred
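
As a rough illustration of those three files, the sketch below generates them into a local directory. Everything specific in it is a placeholder rather than a value from this thread: the directory name, the script name run-spark.sh, the property names inputPath and threshold, and the example.com hosts and ports. The spark-submit line simply mirrors a command-line invocation, and the jar is assumed to sit next to workflow.xml in the application directory.

```shell
#!/bin/sh
# Sketch: generate the three files an Oozie shell action needs.
# All names, hosts, and paths are PLACEHOLDERS chosen for illustration.
APP_DIR="${APP_DIR:-./spark-shell-action}"
mkdir -p "$APP_DIR"

# 1. The script the shell action runs: the same spark-submit as on the command line.
cat > "$APP_DIR/run-spark.sh" <<'EOF'
#!/bin/sh
# The jar is shipped next to this script by the <file> element in workflow.xml.
spark-submit --class com.cloudera.sparkwordcount.SparkWordCount \
  --master yarn sparkwordcount-0.0.1-SNAPSHOT.jar "$1" "$2"
EOF
chmod +x "$APP_DIR/run-spark.sh"

# 2. The workflow definition with a single shell action.
cat > "$APP_DIR/workflow.xml" <<'EOF'
<workflow-app name="spark-via-shell" xmlns="uri:oozie:workflow:0.4">
  <start to="spark-shell"/>
  <action name="spark-shell">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>run-spark.sh</exec>
      <argument>${inputPath}</argument>
      <argument>${threshold}</argument>
      <file>run-spark.sh</file>
      <file>sparkwordcount-0.0.1-SNAPSHOT.jar</file>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Shell action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF

# 3. The job properties; replace the placeholder hosts with your own.
cat > "$APP_DIR/job.properties" <<'EOF'
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
inputPath=/user/paul
threshold=2
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/spark-shell-action
EOF
```

After copying the directory (including the application jar) to the HDFS path named in oozie.wf.application.path, the workflow would typically be submitted with `oozie job -config job.properties -run`, plus the `-oozie` URL of your Oozie server.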

Re: how to get the spark support with 4.0.0-cdh5.3.2 oozie

Contributor
Hi Wilfred,
I have installed the Spark gateway, and YARN was already installed alongside Oozie. Unfortunately, when I run this command from the shell:
spark-submit --class com.cloudera.sparkwordcount.SparkWordCount --master yarn target/sparkwordcount-0.0.1-SNAPSHOT.jar /user/paul 2
I get this error:
Exception in thread "Driver" scala.MatchError: java.lang.NoClassDefFoundError: org/apache/hadoop/mapred/JobConf (of class java.lang.NoClassDefFoundError)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:432)
How can I resolve this?
Thanks in advance.
Paul
