
How to install Spark 2 on an RPM-installed cluster

Expert Contributor

Hi All

 

I have been asked to install Spark 2 on a CDH 5.8 cluster that was set up from RPM packages. How can I install Spark 2 on this existing cluster? Has anyone had the same question or experience? Thanks a lot.

 

Note: I checked the CDH documentation and found that Spark 2 can be installed via a parcel, but a parcel install seems to conflict with an RPM package install.

2 REPLIES

New Contributor

I have this problem too. No word from Cloudera if and when they will ship Spark 2 RPM packages for CDH 5.

I think you could install Spark 2 from Apache Bigtop (or build your own RPM) on an edge node and deploy Spark 2 jobs with YARN. With YARN you would not need Spark Worker packages on the worker nodes.

 

Edit:

I just tried this with Apache Zeppelin and it seems to work. I took the tar.gz from spark.apache.org and extracted it on an edge node. Then I configured zeppelin-env.sh with the following variables:

 

export HADOOP_USER_NAME=spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
export MASTER=yarn-client
export SPARK_HOME=/opt/spark-2.2.0-bin/hadoop2.6

 

When I run Spark code in Zeppelin, I can see that the jobs are executed on YARN, and they can access HDFS files.
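
The same setup can probably be smoke-tested without Zeppelin by submitting the SparkPi example that ships with the Spark tarball. This is only a sketch: it reuses the SPARK_HOME and HADOOP_CONF_DIR values from above, and the examples jar name assumes the Scala 2.11 build of Spark 2.2.0, so check the actual file name under examples/jars first. The spark_pi_cmd helper is a made-up name that just prints the command.

```shell
#!/bin/sh
# Defaults copied from the zeppelin-env.sh settings above; override them in
# the environment if your paths differ.
: "${SPARK_HOME:=/opt/spark-2.2.0-bin/hadoop2.6}"
: "${HADOOP_CONF_DIR:=/etc/hadoop/conf}"
export SPARK_HOME HADOOP_CONF_DIR

# Assumption: jar name of the Scala 2.11 build of Spark 2.2.0.
EXAMPLES_JAR="$SPARK_HOME/examples/jars/spark-examples_2.11-2.2.0.jar"

# Hypothetical helper: print the command that would be submitted.
spark_pi_cmd() {
  echo "$SPARK_HOME/bin/spark-submit --master yarn --deploy-mode client" \
       "--class org.apache.spark.examples.SparkPi $EXAMPLES_JAR 10"
}

spark_pi_cmd

# Submit only if this Spark 2 install is actually present on the node.
if [ -x "$SPARK_HOME/bin/spark-submit" ]; then
  "$SPARK_HOME/bin/spark-submit" --master yarn --deploy-mode client \
    --class org.apache.spark.examples.SparkPi "$EXAMPLES_JAR" 10
fi
```

If the job shows up in the YARN ResourceManager UI and finishes, the edge node install is picking up the cluster configuration from HADOOP_CONF_DIR.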
