Multiple Spark versions on the same cluster

New Contributor

Is there a workaround for installing multiple Spark versions on the same cluster for different uses?

One of the products I want to use has a compatibility issue with Spark 1.5 and is only compatible with 1.3, so I need to install both versions, 1.5 and 1.3. Is there a way to achieve this?

1 ACCEPTED SOLUTION

Rising Star

Cloudera Manager (CM) supports a single version of Spark on YARN and a single version for a Standalone installation (a single version is the common requirement).

To support multiple versions of Spark, you need to install the additional version manually on a single node and copy the config files for YARN and Hive into its conf directory. When you call the spark-submit of that version, it distributes the Spark-core binary to the YARN nodes to execute your code. You don't need to install Spark on each YARN node.
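The steps above can be sketched in Python as a small launcher helper. This is only an illustration under stated assumptions, not the procedure from the answer itself: the function names `stage_cluster_conf` and `build_submit` are hypothetical, the install path and config file locations are whatever you pass in, and pointing `HADOOP_CONF_DIR` at the manual install's own conf directory is one way (an assumption here) to make that spark-submit pick up the copied YARN and Hive files.

```python
import os
import shutil
from pathlib import Path


def stage_cluster_conf(spark_home, cluster_conf_files):
    """Copy cluster client configs (e.g. yarn-site.xml, hive-site.xml)
    into <spark_home>/conf and return the destination paths."""
    conf_dir = Path(spark_home) / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in cluster_conf_files:
        dest = conf_dir / Path(src).name
        shutil.copy(src, dest)
        copied.append(dest)
    return copied


def build_submit(spark_home, master, app, app_args=(), base_env=None):
    """Return (argv, env) for launching `app` with the spark-submit that
    lives under spark_home, leaving the cluster-managed Spark untouched."""
    env = dict(os.environ if base_env is None else base_env)
    env["SPARK_HOME"] = str(spark_home)
    # Assumption: the YARN/Hive client configs were copied into this conf dir.
    env["HADOOP_CONF_DIR"] = str(Path(spark_home) / "conf")
    argv = [
        str(Path(spark_home) / "bin" / "spark-submit"),
        "--master", master,
        app,
        *app_args,
    ]
    return argv, env
```

The returned pair can be handed to `subprocess.run(argv, env=env, check=True)` on the node where the extra version was unpacked. The value for `master` depends on the Spark version being launched (for example `yarn-client` on 1.x releases), so it is left to the caller.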


12 REPLIES

Explorer

Yup, people should already be very careful about it.

On the other hand, there are people on older CDH versions with no Spark2 support available, or who are just trying to figure out whether a vanilla (newer) version of Spark has some bug(s) fixed, or who have any other reason that works for them.

Regards.

Explorer

However, this is a good explanation of how to run multiple Spark installations on the same CDH cluster, and it only needs adapting to other versions, so it's very valuable.

One question, though: does anything change Kerberos-wise? I have done the same on different clusters, installing 1.6.3 on a non-Kerberized CDH 5.4 (Spark 1.3) and on a Kerberized CDH 5.5.3 (Spark 1.5.0).

Following the same steps as in the non-Kerberos installation (and obtaining a ticket that allows me to spark-submit applications with the regular CDH-installed Spark version), it fails like this:

Failed to renew token: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:nameservice1, Ident: (HDFS_DELEGATION_TOKEN xxxx for yyyy)

Could the answer be extended with the steps necessary for a Kerberized installation? Thanks

avatar

How do I query Hive tables from Spark 2.0? Could you share the steps?