Steps for accessing Spark from the Eclipse IDE to deploy and configure

Expert Contributor

Could anyone share the steps for accessing Spark from the Eclipse IDE, and for deploying and configuring jobs from there?

Re: Steps for accessing Spark from the Eclipse IDE to deploy and configure

Hi Krishna,

I'd recommend reading this post: https://nicolasmaillard.com/2016/02/06/remote-debugging-201-spark/
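In case that link ever moves: the core idea of remote-debugging Spark from Eclipse is to launch the driver JVM with the JDWP agent enabled and then attach Eclipse's "Remote Java Application" debug configuration to that port. A minimal sketch, where the class and jar names are just placeholders:

# Start the driver suspended until a debugger attaches on port 5005
spark-submit \
  --conf "spark.driver.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005" \
  --class com.example.HDFSWordCount \
  spark-test-1.0-SNAPSHOT.jar

Then point the Eclipse debug configuration at the driver's host and port 5005.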

Hope that helps.

Re: Steps for accessing Spark from the Eclipse IDE to deploy and configure

Yeah, I don't think there is a dirt-simple way to run Spark jobs directly from the IDE.

What I would normally do is install the Scala IDE and then:

a) Set up the project with sbt and make sure it works, i.e. if you run sbt assembly and then submit the job with the resulting jar, you get the expected results.
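As a sketch, the build.sbt for that could look like the following (the names and versions are just examples; Spark is marked "provided" because the cluster supplies it at runtime):

// build.sbt - minimal setup for an assembly-packaged Spark job
name := "spark-test"
version := "1.0-SNAPSHOT"
scalaVersion := "2.11.12"

// "provided": keep Spark itself out of the fat jar
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0" % "provided"

With the sbt-assembly plugin on the build, sbt assembly then produces the fat jar under target/scala-2.11/.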

b) Run sbt eclipse and import the generated project into the Scala IDE.

(You could also just create a Scala project and import the Spark assembly as a dependency, but then you need to take care of all the dependency jars yourself.)
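The sbt eclipse command itself comes from the sbteclipse plugin; as a sketch (the version is just an example), add it to project/plugins.sbt:

// project/plugins.sbt - provides the "sbt eclipse" command
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4")

Running sbt eclipse afterwards generates the .project and .classpath files that Eclipse needs for the import.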

Any changes you now make in Scala will be reflected on the file system, and you get all the goodies like auto-completion, error highlighting, etc. You just need to do the final assembly build yourself.

c) Build the assembly jar from the sbt command line, or create a simple Ant job that does it for you.

Unfortunately I am not aware of any effort to include Spark in the normal Eclipse launcher configuration directly. I have heard IntelliJ might be a bit further along, but that was always much less important to me than having the actual code build.

d) Kicking off the jobs

One way is to manually scp the jar to the Hadoop cluster and submit it there (see the example below). I do that because I like to develop on my Mac. Otherwise you would need Eclipse installed on a Spark client node, or you could use the Eclipse Remote Systems view to work directly on the edge node even if you run Eclipse on your local machine.

(Not saying that the above is the most elegant way of doing it, but it works.)
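For step d), the copy-and-submit round trip could look something like this (the host, paths, master, and class name are placeholders that depend on your cluster):

# Copy the assembly jar to an edge node of the cluster
scp target/scala-2.11/spark-test-assembly-1.0-SNAPSHOT.jar user@edgenode:

# Submit it from there
spark-submit --master yarn --deploy-mode client \
  --class com.example.HDFSWordCount \
  spark-test-assembly-1.0-SNAPSHOT.jar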

Re: Steps for accessing Spark from the Eclipse IDE to deploy and configure

New Contributor

This can be accomplished as follows:

1. Build the jar (you might need to build a fat jar that includes additional dependencies).

2. When you create the Spark context in your IDE, specify the master address and the location of your jar. The sample below is for Java:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Point the driver at the standalone master and ship the application jar to the executors
SparkConf conf = new SparkConf()
        .setMaster("spark://MASTER-ADDRESS:7077")
        .setJars(new String[]{"build/libs/spark-test-1.0-SNAPSHOT.jar"})
        .setAppName("HDFSWordCount");
JavaSparkContext sc = new JavaSparkContext(conf);

That's it, now you can run/debug your code on your cluster from the IDE. Please note, you will need to rebuild the jar every time you change anything.
