
Learnings from running Spark 1.6 & 2.0 code bases on the same cluster

Contributor

Hi,

With HDP 2.5 supporting both Spark versions, i.e. 1.6.2 and 2.0 (Technical Preview), has anyone tried running code for both versions (not necessarily within the same Spark application) on the same cluster? Are there any lessons learnt? I have heard there are problems when both versions run on the same cluster, but I don't have further details. I'm wondering if anyone has run tests or is evaluating 2.0 for building future workloads, without any immediate need to migrate current workloads from the old version to 2.0.

Let me know your views.

Cheers,

KK.

1 ACCEPTED SOLUTION

New Contributor

I'm just about to try this myself, so I will be watching for answers. I'm guessing we may need to run two history servers. I don't foresee any other problems.


8 REPLIES

Contributor

Do you mean a standalone Spark cluster or a YARN cluster?

Contributor

A YARN cluster.

New Contributor

Just installed Spark 2.0.1 on HDP 2.4 alongside Spark 1.6.0, and it works just fine.

You need a second history server (and its own HDFS event-log directory).
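
For what it's worth, here is a minimal sketch of what the second history server's settings might look like, assuming a separate HDFS event-log directory for Spark 2.x jobs. The property names are standard Spark; the paths and the port are placeholders, not taken from the posts above.

    # spark-defaults.conf for the Spark 2.x install (example values only)
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs:///spark2-history
    spark.history.fs.logDirectory    hdfs:///spark2-history
    # the default history UI port is 18080; pick a different one so both
    # history servers can run side by side
    spark.history.ui.port            18081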

Contributor

Hi Jonathan,

Please let me know your findings; I will share mine as I make progress. I have a test workload to run on my test cluster, but I need to upgrade to HDP 2.5 before I can run the tests. I'm currently snowed under at work, so I haven't found the time. It is on my list of things to do.

Cheers,

KK

Contributor

I tried running KMeans clustering code built for Spark 1.6 on Spark 2.0 and ran into problems with the Vector data types. This could be because Spark 2.0 uses vectors from the "ml" package rather than "mllib", but the KMeans fit method seems to call some mllib functions under the hood, and those functions don't understand the "ml" vectors. I'll post here once I find a workaround for this issue.

Contributor

Clustering algorithms in Spark 2.0 use the ML package, so the feature vectors need to be of the ML Vector type instead of the MLlib one. I had to convert an MLlib vector to an ML vector to make it work in Spark 2.0.
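
In case it helps anyone hitting the same type mismatch, here is a minimal sketch of the conversion, assuming a DataFrame df with an MLlib-vector column named "features" (both names are placeholders, not from the posts above). MLUtils.convertVectorColumnsToML and the per-vector asML method are available from Spark 2.0 onwards.

    // Spark 2.0, Scala. Assumes `df` has a "features" column that still
    // holds old org.apache.spark.mllib.linalg vectors.
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.ml.clustering.KMeans

    // Rewrite the column to the new ml.linalg vector type.
    val mlDf = MLUtils.convertVectorColumnsToML(df, "features")

    // The ml-package KMeans now accepts the converted features.
    val model = new KMeans().setK(3).setFeaturesCol("features").fit(mlDf)

    // A single vector can also be converted directly:
    // val newVec = oldVec.asML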
