
Which CDH release will include Spark 1.4.x?


Rising Star

Does anyone know which CDH release will include/support Spark 1.4?

If so, any timetable?

 



Re: Which CDH release will include Spark 1.4.x?

Master Collaborator

Presumably CDH 5.5, since in general a component only moves to a new minor version in a new CDH minor release. There are no published timeframes for this, but CDH is typically on a 4-6 month minor release cycle, and 5.4 came out 2 months ago.

Re: Which CDH release will include Spark 1.4.x?

Master Collaborator

PS: I should also say that you should be able to use 1.4 with CDH 5.4 and have it generally work. This requires a little understanding of how to get a build onto a machine and run from that build, but otherwise it's just a YARN app and, modulo some possible dependency issues at the edges, it should just work.

Re: Which CDH release will include Spark 1.4.x?

Rising Star

That's even better!

I could give it a try, at least in my lab environment.

Could you please point me to some info, links, docs, or blog posts on how this could be done?

 

Re: Which CDH release will include Spark 1.4.x?

Master Collaborator

If it were me, I'd download the source for 1.4.0 and build it against the exact CDH artifacts to be safest. See http://spark.apache.org/docs/latest/hadoop-third-party-distributions.html. Then just try running the local copy of bin/spark-shell etc. from that distribution. You need to use YARN masters. I won't 100% guarantee that it works, but I see no reason it wouldn't. The build flags are probably something like: -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.4.3 -Pyarn
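
For anyone who wants the flags above in context, here is a rough sketch of what the build and launch might look like when run from the root of an unpacked Spark 1.4.0 source tree. Only the three build flags and bin/spark-shell come from the reply. The build/mvn wrapper, -DskipTests, the clean package goals, the /etc/hadoop/conf path, and --master yarn-client are assumptions, so adjust them to your cluster. The script only prints the commands unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Sketch only: prints the build and launch commands; set RUN=1 to execute them.
# Run from the root of the Spark 1.4.0 source tree.
set -e

HADOOP_PROFILE="hadoop-2.6"            # from the thread
CDH_HADOOP_VERSION="2.6.0-cdh5.4.3"    # from the thread; match your CDH version
CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"  # assumption: usual client config path

# Assumption: Maven wrapper shipped in the source tree; tests skipped to save time.
BUILD_CMD="build/mvn -Pyarn -P${HADOOP_PROFILE} -Dhadoop.version=${CDH_HADOOP_VERSION} -DskipTests clean package"

# Assumption: yarn-client mode so the shell's driver runs on the local machine.
SHELL_CMD="HADOOP_CONF_DIR=${CONF_DIR} bin/spark-shell --master yarn-client"

run() {
  # Echo the command; execute it only when RUN=1.
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    eval "$*"
  fi
}

run "$BUILD_CMD"
run "$SHELL_CMD"
```

Keeping the dry run as the default makes it easy to check the exact flags before starting a long build.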


Re: Which CDH release will include Spark 1.4.x?

New Contributor

I tried this but had some trouble using things like pyspark. Is there a gist or something somewhere with exact steps for CDH?

I will try again and post what I did.

Re: Which CDH release will include Spark 1.4.x?

Rising Star

Great, thank you for your quick response!

Hoping to have CDH 5.5 released sooner than 4-5 months :-)