11-20-2017 12:57 AM
I have this problem too. There has been no word from Cloudera on whether or when they will ship Spark 2 RPM packages for CDH 5. I think you could install Spark 2 from Apache Bigtop (or build your own RPM) on an edge node and deploy Spark 2 jobs with YARN. With YARN you would not need Spark Worker packages on the worker nodes.

Edit: I just tried this with Apache Zeppelin and it seems to work. I took the tar.gz from spark.apache.org and extracted it on an edge node, then configured zeppelin-env.sh with the following variables:

export HADOOP_USER_NAME=spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
export MASTER=yarn-client
export SPARK_HOME=/opt/spark-2.2.0-bin/hadoop2.6
When I run Spark code in Zeppelin, I can see that the jobs are executed through YARN, and they can access HDFS files.
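
If it helps anyone trying the same setup, here is a rough sketch of a sanity check for the two directories used in zeppelin-env.sh. The function name check_spark_env is made up for this post and is not part of Spark or Zeppelin. It only assumes that a YARN client needs yarn-site.xml in HADOOP_CONF_DIR and that the extracted tarball provides bin/spark-submit.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark or Zeppelin): checks that the
# Hadoop config directory and the Spark home look usable for a YARN client.
check_spark_env() {
  conf_dir=$1
  spark_home=$2
  status=0
  # A YARN client reads the cluster's client-side configuration from here.
  if [ ! -f "$conf_dir/yarn-site.xml" ]; then
    echo "no yarn-site.xml in $conf_dir"
    status=1
  fi
  # The extracted Spark tarball should contain an executable bin/spark-submit.
  if [ ! -x "$spark_home/bin/spark-submit" ]; then
    echo "no executable spark-submit in $spark_home/bin"
    status=1
  fi
  return "$status"
}

# Example call with the paths from this post:
# check_spark_env /etc/hadoop/conf /opt/spark-2.2.0-bin/hadoop2.6
```

The function prints one line per problem and returns non-zero if anything is missing, so it can be run by hand on the edge node before restarting Zeppelin.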