Cloudera plans on Spark 2.0.0

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

(This doesn't sound related to this thread?)

Re: Cloudera plans on Spark 2.0.0

New Contributor

Sorry. I posted the question here because the first page states: 'CDH is already effectively on 1.6.2', and the bug should be fixed in 1.6.2.

 

I'll create a new topic for this.

Re: Cloudera plans on Spark 2.0.0

Explorer

Hi

 

Strictly speaking, you don't need to wait for Cloudera to release Spark 2.0.0. Since Spark can run as a YARN application, it is possible to run a Spark version other than the one bundled with the Cloudera distribution. This requires no administrator privileges and no changes to the cluster configuration, and can be done by any user who has permission to run a YARN job on the cluster. A YARN application ships all its dependencies to the cluster on each invocation, so you can run multiple Spark versions simultaneously on a YARN cluster. Each version of Spark is self-contained in the user's workspace on the edge node. Running a new Spark version will not affect any other jobs running on your cluster.

 

Essentially, you download and extract spark-2.0.0 on the edge node, copy your existing cluster configuration and hive-site.xml to its configuration directory, and run spark-shell from the new location. This should work out of the box. There are a few optional configuration tweaks; detailed instructions for them are easy to find with a web search.
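
The steps above can be sketched roughly as follows. This is only an illustration, not a tested recipe for any particular cluster: the helper names `prepare_spark_home` and `spark_shell_launch` are made up for this example, and the paths you pass in (for instance `/etc/hadoop/conf` and `/etc/hive/conf/hive-site.xml`, which are common locations on CDH edge nodes) are assumptions you should check against your own setup. The sketch sets `HADOOP_CONF_DIR` so the new Spark picks up the existing cluster configuration, and uses the standard `--master yarn --deploy-mode client` options.

```python
import os
import shutil
from pathlib import Path


def prepare_spark_home(spark_home, hive_site):
    """Copy hive-site.xml into the conf directory of a user-local Spark install.

    spark_home: directory where the spark-2.0.0 tarball was extracted (assumed).
    hive_site:  path to the cluster's existing hive-site.xml (assumed).
    """
    conf_dir = Path(spark_home) / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    target = conf_dir / "hive-site.xml"
    shutil.copy(hive_site, target)
    return target


def spark_shell_launch(spark_home, hadoop_conf_dir):
    """Build (argv, env) for starting spark-shell on YARN from spark_home.

    Nothing is executed here; the caller decides when to launch.
    """
    env = dict(os.environ)
    env["SPARK_HOME"] = str(spark_home)
    # Point Spark at the existing client-side Hadoop/YARN configuration.
    env["HADOOP_CONF_DIR"] = str(hadoop_conf_dir)
    argv = [
        str(Path(spark_home) / "bin" / "spark-shell"),
        "--master", "yarn",
        "--deploy-mode", "client",
    ]
    return argv, env
```

To actually start the shell, you would pass the result to `subprocess.run(argv, env=env)` on the edge node. Building the command separately from running it makes it easy to inspect what will be launched before touching the cluster.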

 

Note, though, that jobs running on Spark 2.0.0 may not be supported by Cloudera until the official release.

 

Regards

Deenar

 

 

Re: Cloudera plans on Spark 2.0.0

Explorer

@DeenarT: The question is not whether you can point to a separate Spark distribution, but whether it is natively available. On a side note, Spark 2.0.0 has already been released. Not sure if you meant Cloudera's version.

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

PS Spark 2 is now available as a 'beta' add-on for CDH 5.x, such that you can use both 2.0 and 1.6 on the same cluster.

 

http://www.cloudera.com/documentation/betas/spark2/latest/topics/spark2_release_notes.html

Re: Cloudera plans on Spark 2.0.0

New Contributor

Hello Sean,

 

Any idea when we might get a Spark 2.0 GA for CDH?

 

Kind regards

Geert

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

It was released about a month ago: http://www.cloudera.com/downloads/spark2/2-0.html

Re: Cloudera plans on Spark 2.0.0

New Contributor

We are using Hive-on-Spark. Is there any documentation on how to use Spark 2 for Hive-on-Spark?

 

Regards,

Vijay