Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

New Contributor

In Zeppelin, when I run sc.version, it returns 1.6.2.

But I would like to run Spark 2.0 in Zeppelin instead. Is there a way to set this up?

17 REPLIES

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

@Mark Ott

Yes, it is possible to run Spark 2.0 within HDP 2.5 (although it is in tech preview).

Here's a great article by Paul Hargis that'll walk you through the process of setting up Spark 2.0: https://community.hortonworks.com/articles/53029/how-to-install-and-run-spark-20-on-hdp-25-sandbox.h...

Actually, in HDP 2.5, Spark 2.0 ships as part of the environment. It is located at /usr/hdp/2.5.0.0-1234/spark2.

To use it within Zeppelin, you will need to set the environment variables SPARK_HOME=/usr/hdp/current/spark2-client and SPARK_MAJOR_VERSION=2.

You can set these by following the instructions in the Zeppelin documentation: https://zeppelin.apache.org/docs/latest/interpreter/spark.html

You can find the zeppelin-env.sh at a location similar to this: /usr/hdp/2.5.0.0-1234/zeppelin/conf/zeppelin-env.sh
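
For reference, the two settings described above might look like this in zeppelin-env.sh. This is only a sketch: it assumes the /usr/hdp/current/spark2-client path mentioned elsewhere in this thread resolves to the Spark 2 install on your sandbox, so adjust it if yours differs.

```shell
# Sketch of lines to add to zeppelin-env.sh.
# Assumption: /usr/hdp/current/spark2-client points at the Spark 2 install.
export SPARK_HOME=/usr/hdp/current/spark2-client
# Tells the HDP wrapper scripts to pick Spark 2 rather than Spark 1.
export SPARK_MAJOR_VERSION=2
```

Zeppelin reads zeppelin-env.sh at startup, so a restart of the Zeppelin service is likely needed before the change takes effect.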

Here's an additional link that may be helpful: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/spark-choo...

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

New Contributor

I looked over the article but it does not mention how to configure Zeppelin for Spark 2.0.

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

@Mark Ott

The link above shows how to install Spark 2 within your environment. Actually, in HDP 2.5, Spark 2.0 ships as part of the environment. It is located at /usr/hdp/2.5.0.0-1234/spark2.

To use it within Zeppelin, you will need to set the environment variables SPARK_HOME and SPARK_MAJOR_VERSION.

You can set these by following the instructions in the Zeppelin documentation: https://zeppelin.apache.org/docs/latest/interpreter/spark.html

You can find the zeppelin-env.sh at a location similar to this: /usr/hdp/2.5.0.0-1234/zeppelin/conf/zeppelin-env.sh

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

This could also be causing you some trouble: there was a bug in the Spark interpreter, which was fixed in Zeppelin 0.6.2.

https://issues.apache.org/jira/browse/ZEPPELIN-1390

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

@Mark Ott

Please follow these steps:

1) Remove SPARK_HOME from zeppelin-env.sh.

2) In the Zeppelin UI, go to the Interpreters page, edit the spark interpreter, and add the property SPARK_HOME = /usr/hdp/current/spark-client

3) Still on the Interpreters page, create a new interpreter whose interpreter group is spark. Name this interpreter spark2.

4) Edit the spark2 interpreter and add the property SPARK_HOME = /usr/hdp/current/spark2-client

5) Now create a notebook and run the following paragraph:

%spark2

sc.version

It should report the Spark 2 version.

Hope this helps
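
Before filling in the two SPARK_HOME properties above, it may be worth confirming that each path really contains a Spark install. A minimal sketch, assuming the usual Spark layout with bin/spark-submit under SPARK_HOME; check_spark_home is a hypothetical helper name, not part of HDP or Zeppelin:

```shell
# Hypothetical helper: report whether a directory looks like a Spark install
# by checking for an executable bin/spark-submit (an assumed layout).
check_spark_home() {
  if [ -x "$1/bin/spark-submit" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Example usage on the sandbox (paths taken from the steps above):
#   check_spark_home /usr/hdp/current/spark-client
#   check_spark_home /usr/hdp/current/spark2-client
```

If either path is reported as missing, the interpreter pointing at it would presumably fail to start, so it makes sense to fix the path first.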

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

New Contributor

Hi

It's not working for me. I'm getting:

	%spark2
	sc.version
Prefix not found.

Any suggestions?

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

@Mikolaj Habdank-Wojewodzki Sorry for the very delayed response. I made a mistake. Try this:

%spark2

spark.sparkContext.version

This should print the Spark 2 version.

Basically, when using the spark2 interpreter, your entry point in the Spark code is 'spark.sparkContext' and not 'sc'. 'sc' is for Spark 1.

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

New Contributor

And after restarting Zeppelin, I'm now getting this:

ERROR [2017-02-09 11:16:36,679] ({qtp290658609-69} NotebookServer.java[afterStatusChange]:1136) - Error org.apache.zeppelin.interpreter.InterpreterException: Interpreter spark2 not found

Re: Is it possible to run Spark 2.0 version in Zeppelin on HDP 25 sandbox

@Mikolaj Habdank-Wojewodzki Are you getting this exception while running an existing notebook? If so, please check whether your newly created spark2 interpreter has been bound to that notebook. You will see a button called 'Interpreter Binding' in the upper-right corner of the notebook.
