Overview

In Zeppelin, R is currently supported through the Livy interpreter: start a paragraph with %livy.sparkr.

Environment

I tested the solution below with:

  • HDP 2.5.0.0-1245 and Ambari 2.4.0.1
  • HDP 2.5.3.0-37 and Ambari 2.4.2.0

Steps To Follow

1. Install R on all YARN worker nodes

$ yum install R-devel libcurl-devel openssl-devel

2. Confirm R works fine

[root@dkhdp251 zeppelin]# R -e "print(1+1)"
R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
  Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


> print(1+1)
[1] 2

3. Confirm R works fine from Zeppelin

12249-1.png
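
The screenshot above is not reproduced as text. As a minimal sketch of an equivalent check (an assumption, not necessarily what the screenshot showed), a paragraph like the following could be run in Zeppelin. The %livy.sparkr directive is written as a comment here so that the snippet is also valid plain R.

```r
# In Zeppelin, make "%livy.sparkr" the first line of the paragraph.
result <- 1 + 1
print(result)            # same arithmetic check as in step 2
print(R.version.string)  # shows which R installation the interpreter picked up
```

If the paragraph prints the sum and an R version string, the Livy interpreter can reach the R installation from step 1.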

4. Install the following packages from Zeppelin: data.table, base64enc, knitr and ggplot2

%livy.sparkr
install.packages('data.table', repos = 'http://cran.us.r-project.org')
install.packages('base64enc', repos = 'http://cran.us.r-project.org')
install.packages('knitr', repos = 'http://cran.us.r-project.org')
install.packages('ggplot2', repos = 'http://cran.us.r-project.org')
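
Re-running the paragraph above reinstalls every package each time. The following is a rough sketch of one way to install only what is missing. The helper names find_missing and install_if_missing are hypothetical and not part of the original article; the repository URL is the one used above.

```r
# Hypothetical helper: names of the requested packages that are not installed.
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only the packages that are missing.
install_if_missing <- function(pkgs, repos = "http://cran.us.r-project.org") {
  to_install <- find_missing(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(to_install)
}

# Usage (uncomment inside a %livy.sparkr paragraph):
# install_if_missing(c("data.table", "base64enc", "knitr", "ggplot2"))
```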

5. Confirm the packages have been installed

%livy.sparkr
library(data.table)
library(base64enc)
library(knitr)
library(ggplot2)

The paragraph above should finish without errors and produce no output.
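
Because a successful library() call is silent, it can be hard to tell success from a paragraph that never ran. As an alternative sketch (not part of the original steps), requireNamespace() reports availability explicitly without attaching the packages:

```r
pkgs <- c("data.table", "base64enc", "knitr", "ggplot2")

# TRUE for each package that can be loaded, FALSE otherwise.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

if (!all(available)) {
  message("Not installed: ", paste(pkgs[!available], collapse = ", "))
}
```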

6. Run the following code to build the graphics

%livy.sparkr
library(data.table)
library(ggplot2)
library(knitr)
set.seed(42)
# generate sample data
dat <- rbind(
  data.table(gender = "female", value = rnorm(1e4)),
  data.table(gender = "male", value = rnorm(1e4, 2, 1))
)
# plot
p1 <- ggplot(dat,aes(x=value,color=gender)) + geom_density()
# save to tmp file
ggsave(filename="/tmp/myplot.png", plot=p1)
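# encode the image as a base64 data URI (knitr::image_uri) and print it as Zeppelin %html output
printImageURI <- function(file) {
  uri <- image_uri(file)
  file.remove(file)  # delete the temporary image once it has been encoded
  cat(sprintf("%%html <img width='700' src=\"%s\" />\n", uri))
}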
printImageURI("/tmp/myplot.png")

This generates the following plot:

12271-3.png
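
The code in step 6 writes to the fixed path /tmp/myplot.png, so two notebooks running it at the same time could overwrite each other's file. A minimal sketch of a variant is shown below. It assumes the base64enc package from step 4 is installed and uses its dataURI() function in place of knitr's image_uri(); the helper name image_html is hypothetical.

```r
library(base64enc)

# Hypothetical helper: build the Zeppelin %html line for a PNG file.
image_html <- function(file, width = 700) {
  uri <- dataURI(file = file, mime = "image/png")  # "data:image/png;base64,..."
  sprintf("%%html <img width='%d' src=\"%s\" />\n", width, uri)
}

# Usage with the plot object p1 from step 6, using a unique temporary file:
# tmp <- tempfile(fileext = ".png")
# ggsave(filename = tmp, plot = p1)
# cat(image_html(tmp))
# unlink(tmp)
```

tempfile() returns a path that is unique within the R session's temporary directory, which avoids the shared file name.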


Comments

+1 for a nice article! I had to add "library(ggplot2)" in steps 4 and 6, which provides the ggplot function.