
Cloudera plans on Spark 2.0.0


Explorer

Hello,

 

I don't see any updates regarding Spark 2.0.0 on the product matrix. Since Spark 2.0.0 has been released, I'm wondering when Cloudera plans to support it. As of 5.8, I can only see 1.6.0. Another question is whether Cloudera plans to move directly to 2.0.0, rather than to 1.6.1 first and then to 2.0.0.

 

Could someone please address this?

 

Thanks,

RK


Re: Cloudera plans on Spark 2.0.0

Master Collaborator
There is no formal announcement, but as you can imagine, it can't be long before it is available.

Within a major release line, CDH generally can't introduce breaking changes, and Spark 2 makes breaking changes. The base Spark has to stay on 1.6, but that doesn't mean 2.0 can't also be made available as an option.

CDH is already effectively on 1.6.2.

Re: Cloudera plans on Spark 2.0.0

Explorer

Thanks.

 

However,

http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_rn_fixed_in_57.html#concep... has the following issue fixed:

  • SPARK-4452 - Shuffle data structures can starve others on the same thread for memory --> The fix is available in Spark 2.0.0

On the other hand,

  • SPARK-13622 - Issue creating level db for YARN shuffle service --> is fixed in both 1.6.2 / 2.0.0

 

Did Cloudera backport these patches without actually shipping Spark 2.0.0?

 

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

Yes. The CDH maintenance patch set can differ from upstream for any project. Fixes are backported as appropriate, even when whoever merged a fix upstream didn't backport it into the corresponding upstream branch. Likewise, an upstream project may merge a change into a maintenance branch when that probably wasn't the right thing to do, and CDH would not include it.
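
Since the patch set can differ from upstream, one way to check a specific fix is to search a saved text copy of the release notes for its JIRA ID. The following is only a rough sketch: the helper name `has_fix` is hypothetical, and the sample file is illustrative data holding just the two entries quoted earlier in this thread, not the real release notes.

```shell
#!/usr/bin/env bash
# has_fix FILE JIRA_ID
# Succeeds if JIRA_ID appears as a whole word in FILE.
# (Hypothetical helper; FILE is assumed to be a plain-text copy of release notes.)
has_fix() {
  grep -qwF -- "$2" "$1"
}

# Illustrative sample only: the two entries quoted in this thread.
notes="$(mktemp)"
cat > "$notes" <<'EOF'
SPARK-4452 - Shuffle data structures can starve others on the same thread for memory
SPARK-13622 - Issue creating level db for YARN shuffle service
EOF

for id in SPARK-4452 SPARK-13622; do
  if has_fix "$notes" "$id"; then
    echo "$id: listed"
  else
    echo "$id: not listed"
  fi
done

rm -f "$notes"
```

The `-w` flag keeps a shorter ID from matching inside a longer one, and `-F` treats the ID as a literal string rather than a pattern.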

Re: Cloudera plans on Spark 2.0.0

Explorer

I'm on 5.8.0-1.cdh5.8.0.p0.42, and the Spark version is 1.6.0.

 

$ spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

 

$ spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/


How do I upgrade to 1.6.2, as you mentioned in your earlier post?

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

The reported version will always be x.y.0, even though the build contains patches on top of upstream x.y.0. It would probably be a little nicer if this version read like "1.6.0-CDH-5.7.1" or similar, because that's how the Maven artifacts are named, specifically to make clear that it's not the same as upstream x.y.0. The list of exact patches is always available in the release notes. My point is that you already have a version with many changes on top of upstream 1.6.0, which ought to be similar (but not necessarily identical) to the patches between 1.6.0 and 1.6.2.
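
To make the naming point concrete, here is a minimal sketch of splitting a vendor-suffixed version string into its upstream base and its vendor suffix. The function names are hypothetical, and the exact string format ("1.6.0-cdh5.7.1") is an assumption for illustration; check the actual artifact names for your release.

```shell
#!/usr/bin/env bash
# Split a vendor-suffixed version string at the first '-'.
# The "1.6.0-cdh5.7.1" format is an assumed example, not taken from a real artifact list.

upstream_version() {
  # Keep only the part before the first '-'.
  printf '%s\n' "${1%%-*}"
}

vendor_suffix() {
  # Print the part after the first '-', or an empty line if there is no '-'.
  case "$1" in
    *-*) printf '%s\n' "${1#*-}" ;;
    *)   printf '\n' ;;
  esac
}

v="1.6.0-cdh5.7.1"
echo "upstream base: $(upstream_version "$v")"
echo "vendor suffix: $(vendor_suffix "$v")"
```

A string with no suffix, such as the plain "1.6.0" printed by `spark-submit --version` above, yields an empty vendor suffix, which is exactly why the banner alone can't tell you which patches are included.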

Re: Cloudera plans on Spark 2.0.0

Explorer

OK @srowen, thank you for the explanation.

Re: Cloudera plans on Spark 2.0.0

New Contributor

Hello,

 

I hear CDH 5.9 will come with Spark 2.0.0.

 

What's the ETA for CDH 5.9?

 

Thanks

Re: Cloudera plans on Spark 2.0.0

Master Collaborator

It's alluded to here: http://vision.cloudera.com/enhanced-streaming-and-machine-learning-with-apache-spark-2-0/. It will be available as a 'beta' soon. I have not heard anything more specific than that, but I also think it can't be long. The installed Spark in CDH 5.x must stay on Spark 1.x because Spark 2.x is not backwards-compatible, but it's possible to have Spark 2.x installed and available alongside it. You can imagine that this is how it will be distributed, so it won't necessarily be tied to the core CDH distribution.

Re: Cloudera plans on Spark 2.0.0

New Contributor

Hi,

 

We encountered the deadlock described in this issue: https://issues.apache.org/jira/browse/SPARK-13566. It should be fixed in 1.6.2. Can you tell me which CDH version includes this fix? We're using 'cdh5.9.0.p0.23'. Thanks.

 

 
