New Contributor
Posts: 3
Registered: ‎11-17-2017
Accepted Solution

How to install spark2 on rpm installed cluster

Hi All

 

I have been asked to install Spark 2 on a CDH 5.8 cluster that was set up via RPM packages. How can I install Spark 2 on the existing cluster? Has anyone had the same question or experience? Thanks a lot.

 

Note: I checked the CDH documentation and found that I can install Spark 2 via a parcel, but parcels seem to conflict with the RPM package install.

New Contributor
Posts: 2
Registered: ‎11-09-2017

Re: How to install spark2 on rpm installed cluster

[ Edited ]

I have this problem too. There is no word from Cloudera on whether or when they will ship Spark 2 RPM packages for CDH 5.

I think you could install Spark 2 from Apache Bigtop (or build your own RPM) on an edge node and deploy Spark 2 jobs with YARN. With YARN you would not need the Spark Worker packages on the worker nodes.

 

Edit:

I just tried this with Apache Zeppelin and it seems to work. I took the tar.gz from spark.apache.org and extracted it on an edge node, then configured zeppelin-env.sh with the following variables:

 

export HADOOP_USER_NAME=spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
export MASTER=yarn-client
export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.6

 

When I run Spark code in Zeppelin, I can see that the jobs are executed on YARN, and they can access HDFS files.
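
If you want to check the same setup without Zeppelin, something like the following might work from the edge node. This is only a sketch: it assumes the extracted tarball is Spark 2.2.0 built for Scala 2.11 (so the bundled examples jar is named spark-examples_2.11-2.2.0.jar), and spark_pi_cmd is a made-up helper name, not part of Spark or CDH.

```shell
# Sketch only: assumes Spark 2.2.0 (Scala 2.11 build) was extracted on the edge
# node and that /etc/hadoop/conf holds the cluster's YARN/HDFS client config.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Hypothetical helper: print the spark-submit command for the bundled SparkPi
# example. $1 is the Spark 2 directory (falls back to $SPARK_HOME if omitted).
spark_pi_cmd() {
  spark_home="${1:-$SPARK_HOME}"
  printf '%s\n' "$spark_home/bin/spark-submit --master yarn --deploy-mode client --class org.apache.spark.examples.SparkPi $spark_home/examples/jars/spark-examples_2.11-2.2.0.jar 10"
}
```

Calling spark_pi_cmd "$SPARK_HOME" only prints the command so you can review it first. To actually submit, run eval "$(spark_pi_cmd "$SPARK_HOME")" and then look for the application with yarn application -list or in the ResourceManager UI.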

New Contributor
Posts: 3
Registered: ‎11-17-2017

Re: How to install spark2 on rpm installed cluster

@MarkusH, thank you. I found a workaround for this, and it has worked well so far.

 

First, we have to migrate CDH from packages to parcels (see "migrate CDH from package to Parcel").

 

Second, we install Spark 2 as an add-on service:

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_mc_addon_services.html

 
