Support Questions
Find answers, ask questions, and share your expertise

Local Spark Development against a remote cluster


Contributor

What is the best way to develop Spark applications on your local computer? I'm using IntelliJ and, just for debugging purposes, trying to set the master to my remote HDP cluster so I can test code against Hive and other resources on the cluster. I'm on HDP 2.5.3 and have added the Spark libraries for Scala 2.10 and Spark 1.6.2 from the Maven repository. I've set scalaVersion to 2.10.5 in my build.sbt and added the library dependencies. As far as I can tell, my project uses the exact same versions that ship with HDP 2.5.3, but when I run the application with SparkConf pointed at my remote Spark master, I get the following incompatible-class error:

java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = 5009924811397974881, local class serialVersionUID = 7185378471520864965

Is there something I'm missing, or is there a better way to develop and test against the remote cluster?
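For context, an `InvalidClassException` with mismatched `serialVersionUID`s almost always means the Spark jars on the local classpath were built differently from the ones running on the cluster. A minimal build.sbt sketch of the usual fix follows; the Hortonworks repo URL and the HDP-suffixed version string (`1.6.2.2.5.3.0-37`) are illustrative assumptions and should be checked against the repo for your exact HDP 2.5.3 build:

```scala
// build.sbt sketch: pin Scala and Spark to the cluster's exact builds and
// resolve Spark from the Hortonworks repo so serialized class versions match.
scalaVersion := "2.10.5"

resolvers += "Hortonworks Releases" at
  "http://repo.hortonworks.com/content/repositories/releases/"

// The HDP build suffix below is illustrative -- browse the repo for the
// artifact version that matches your cluster's HDP 2.5.3 install.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.2.2.5.3.0-37" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.2.2.5.3.0-37" % "provided"
)
```

Marking the Spark artifacts `provided` keeps them off the assembly so the cluster's own jars are used at runtime.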

21 REPLIES

Re: Local Spark Development against a remote cluster

@Eric Hanson Which repository are you getting the Spark libraries from? Use the Hortonworks repo. Check out this documentation on how to build Spark streaming apps; it can be adapted to SBT and to non-streaming apps.

http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_spark-component-guide/content/using-spark...

Also, HDP 2.5 includes two different versions of Spark. Check the settings for SPARK_HOME. For 1.6, use:

export SPARK_HOME=/usr/hdp/current/spark-client
export SPARK_MAJOR_VERSION=1
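As a quick sanity check (a sketch, assuming you are on a cluster node with the HDP client installed), you can confirm which Spark version the client actually picks up and compare it with your local build:

```shell
# Select the Spark 1.6 client on an HDP node and confirm its version.
export SPARK_HOME=/usr/hdp/current/spark-client
export SPARK_MAJOR_VERSION=1
spark-submit --version   # should report 1.6.x, matching the local project
```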

Re: Local Spark Development against a remote cluster

Contributor

Thank you @cduby. I'm using sbt, but maybe I should use a Maven project instead. I translated the examples in the link you shared into sbt dependencies, and that worked for all of the Apache dependencies, but then I get unresolved-dependency errors for "org.mortbay.jetty#jetty;6.1.26.hwx" and "org.mortbay.jetty#jetty-util;6.1.26.hwx", neither of which I had declared in my project. I tried adding library dependencies for them in my build.sbt, but I still get the error. All I found on the repo site was the /org/mortbay/jetty/project/6.1.26.hwx directory. I don't understand why I have a dependency on these artifacts or how to resolve them. Do you know how to fix this error? I may try creating a Maven project and see if I still get the same message. Thanks again for your help.
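For what it's worth, the `.hwx` suffix marks Hortonworks-patched builds that are only published in Hortonworks repositories, which is why plain Maven Central resolution fails; the jetty artifacts come in transitively through the HDP Spark/Hadoop jars. A hedged build.sbt sketch (the second repo URL is taken from the Hortonworks build docs and should be verified for your HDP release):

```scala
// build.sbt sketch: add the Hortonworks repos so sbt can resolve the
// ".hwx" jetty artifacts pulled in transitively by the HDP Spark jars.
resolvers ++= Seq(
  "Hortonworks Releases" at
    "http://repo.hortonworks.com/content/repositories/releases/",
  "Hortonworks Jetty" at
    "http://repo.hortonworks.com/content/repositories/jetty-hadoop/"
)
```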


Re: Local Spark Development against a remote cluster

@Eric Hanson Could you attach your SBT build file?


Re: Local Spark Development against a remote cluster

Contributor

@cduby here's a screenshot

buildsbt.png


Re: Local Spark Development against a remote cluster

@Eric Hanson Do you need the "All Spark Repository"? Can you try removing that repo?

Re: Local Spark Development against a remote cluster

Contributor

@cduby I tried it, but it doesn't change anything.


Re: Local Spark Development against a remote cluster

@Eric Hanson Can you send your spark-submit command line?


Re: Local Spark Development against a remote cluster

Contributor

@cduby I haven't tried to do a spark-submit command line yet. I'm just trying to get the program to compile without error right now.


Re: Local Spark Development against a remote cluster

Contributor
@cduby

I created a Maven project instead and didn't get the dependency-resolution errors, but now I'm getting class-not-found errors when I try to run the program. Is spark-submit the only way to run code on a cluster? I'd really like to debug my code locally as I add to it, to make sure it behaves as expected, and only then move it to the cluster and submit it as a job.
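One common pattern for exactly this workflow (a sketch, not something from the thread: the object and function names here are made up for illustration) is to keep the transformation logic in plain Scala functions that don't touch Spark at all. You can then debug and unit-test them in the IDE on ordinary collections, and only wire them to RDDs when you submit to the cluster; for debugging the Spark wiring itself, running with `.setMaster("local[*]")` avoids cluster version skew entirely.

```scala
// Sketch: Spark-free transformation logic that can be debugged locally.
object WordStats {
  // Pure function: testable on ordinary collections, no cluster needed.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }

  // On the cluster, the equivalent logic would run over an RDD, e.g.:
  //   sc.textFile(path).flatMap(_.split("\\s+")).filter(_.nonEmpty)
  //     .map((_, 1)).reduceByKey(_ + _)
}
```

The split keeps fast local iteration for the logic, while the thin RDD wiring is exercised via local mode or spark-submit.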
