Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Zeppelin + SparkR


Does Zeppelin support the SparkR interpreter?

1 ACCEPTED SOLUTION

Rising Star

OK, here is the latest: the R interpreter for Zeppelin has not yet been merged into the latest Zeppelin distribution. However, you can use it now from this pull request: https://github.com/apache/incubator-zeppelin/pull/208. All the best 🙂


14 REPLIES


As far as I understand, this R interpreter PR does not share the same SparkContext yet. I've already created a notebook that declares one function in Scala and another in Python, and uses the SQL interpreter to call both functions in a single statement along with a custom Java Hive UDF. If we could add a function in R as well, it would be really nice.

Rising Star

Since version 0.6, Zeppelin has provided support for the R interpreter. By default, the R interpreter appears as two Zeppelin interpreters, %r and %knitr. To run Zeppelin with the R interpreter, the following must be installed or set:

  • R (3.0+)
  • JAVA_HOME (Oracle JDK 1.7+)
  • SPARK_HOME (The best way to set this is by editing conf/zeppelin-env.sh. If it is not set, the R interpreter will not be able to interface with Spark.)

You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml. That will ensure that Zeppelin sees the R interpreter the first time it starts up.
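As a rough illustration of the environment settings described above, the relevant lines in conf/zeppelin-env.sh might look like the fragment below. Both paths are placeholders I made up, not values from this thread, so substitute the locations of your own JDK and Spark installation.

```shell
# Fragment for conf/zeppelin-env.sh (sketch only).
# Both paths below are hypothetical placeholders; point them at your own installs.
export JAVA_HOME=/usr/lib/jvm/java   # JDK 1.7+ as required above
export SPARK_HOME=/opt/spark         # without this, the R interpreter cannot reach Spark
```

Zeppelin reads this file on startup, so the daemon needs a restart after you change it.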

Then clone the Zeppelin repository and build it with the following options:

>git clone https://github.com/apache/zeppelin.git 

# enable the "r" and "sparkr" profiles
>mvn clean install -e -DskipTests -Dspark.version={spark_version} -Dhadoop.version={hadoop_version} -Pr -Psparkr -Pvendor-repo -Pexamples -Drat.skip=true -Dcheckstyle.skip=true -Dcobertura.skip=true 

Next, install the SparkR package. In Spark 1.6 or earlier, SparkR needs to be installed manually:

>cd $SPARK_HOME 
>./R/install-dev.sh 
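Before starting Zeppelin, it can help to confirm that the prerequisites from this post are actually in place. The function below is only a sketch and is not part of Zeppelin or Spark; the name check_zeppelin_r_prereqs is made up for illustration. It checks that R is on the PATH and that JAVA_HOME and SPARK_HOME point at existing directories. It does not verify any version numbers.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not shipped with Zeppelin).
# Prints one line per missing prerequisite; returns non-zero if any is missing.
check_zeppelin_r_prereqs() {
    missing=0

    # R must be on PATH. The version (3.0+) is NOT checked here.
    if ! command -v R >/dev/null 2>&1; then
        echo "missing: R executable not found on PATH"
        missing=1
    fi

    # JAVA_HOME and SPARK_HOME must be set and point at existing directories.
    for var in JAVA_HOME SPARK_HOME; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "missing: $var is not set"
            missing=1
        elif [ ! -d "$value" ]; then
            echo "missing: $var points to a non-existent directory: $value"
            missing=1
        fi
    done

    return "$missing"
}
```

After defining the function, run check_zeppelin_r_prereqs in the same shell you will start Zeppelin from; no output and a zero exit status mean all three checks passed.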

Now you can start Apache Zeppelin from the command line:

>cd $ZEPPELIN_HOME 
>bin/zeppelin-daemon.sh start  -e

After a successful start, visit http://localhost:8080 in your web browser. From there you can execute R commands in a notebook just as you would in the CLI.

[Screenshot: 6248-r-liner-regression.png]

Rising Star

Which R environment variable needs to be set up?