ENVIRONMENT

  • HDP-2.6.0.3
  • Ambari 2.5.0.3

SOLUTION

1. Install R and its build dependencies on each DataNode (DN)

$ yum install R-devel libcurl-devel openssl-devel

2. Install the knitr package on each DN

$ R
> install.packages("knitr")
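Steps 1 and 2 have to be repeated on every DN, so a small loop can save some typing. The following is only a sketch: the helper name run_on_hosts, the hosts file datanodes.txt, and the CRAN mirror URL are assumptions that do not come from the original article, and it presumes passwordless SSH to each node.

```shell
# Hypothetical helper: run one command on every host listed in a file.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run that only prints.
run_on_hosts() {
  hosts_file=$1
  shift
  runner=${RUNNER:-ssh}
  while IFS= read -r host; do
    case $host in
      ''|\#*) continue ;;   # skip blank lines and comment lines
    esac
    # stdin is redirected so ssh does not swallow the rest of the hosts file
    "$runner" "$host" "$@" < /dev/null
  done < "$hosts_file"
}

# Example usage (datanodes.txt is a hypothetical file, one hostname per line):
#   run_on_hosts datanodes.txt "yum install -y R-devel libcurl-devel openssl-devel"
#   run_on_hosts datanodes.txt "Rscript -e 'install.packages(\"knitr\", repos=\"https://cloud.r-project.org\")'"
```

Running with RUNNER=echo first shows which command would be sent to which host before anything is installed.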

3. Test R from the command line

[root@dghdp255 ~]# R -e "print(1+1)"
R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> print(1+1)
[1] 2
>
>
[root@dghdp255 ~]#
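The check above can also be scripted so that it passes or fails without reading the banner by eye. This is a minimal sketch; the function name r_smoke_test is hypothetical, and it simply looks for the result line "[1] 2" in whatever R prints.

```shell
# Hypothetical helper: succeed only if stdin contains the exact line "[1] 2".
r_smoke_test() {
  grep -qxF '[1] 2'
}

# Run the check only where Rscript is actually installed.
if command -v Rscript >/dev/null 2>&1; then
  if Rscript -e 'print(1+1)' | r_smoke_test; then
    echo "R OK"
  else
    echo "R did not return the expected result" >&2
  fi
fi
```

Rscript prints only the result without the startup banner, which keeps the output easy to match.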

4. Configure and test the interpreter from the Zeppelin UI

a) spark2 interpreter configuration

Property                                  Value
SPARK_HOME                                /usr/hdp/current/spark2-client/
args
master                                    yarn-client
spark.app.name                            Zeppelin
spark.cores.max
spark.executor.memory
spark.yarn.keytab                         /etc/security/keytabs/zeppelin.server.kerberos.keytab
spark.yarn.principal                      zeppelin-emeasupport@HWX.COM
zeppelin.R.cmd                            R
zeppelin.R.image.width                    100%
zeppelin.R.knitr                          true
zeppelin.R.render.options                 out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
zeppelin.dep.additionalRemoteRepository   spark-packages,http://dl.bintray.com/spark-packages/maven,false;
zeppelin.dep.localrepo                    local-repo
zeppelin.interpreter.localRepo            /usr/hdp/current/zeppelin-server/local-repo/2CHXWU7YZ
zeppelin.pyspark.python                   python
zeppelin.spark.concurrentSQL              false
zeppelin.spark.importImplicit             true
zeppelin.spark.maxResult                  1000
zeppelin.spark.printREPLOutput            true
zeppelin.spark.sql.stacktrace             false
zeppelin.spark.useHiveContext             true
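Before restarting the interpreter it can help to confirm that the filesystem paths named in the settings exist on the Zeppelin node. The sketch below is an assumption about how such a check might look; the function name check_zeppelin_prereqs is hypothetical, and only the two paths passed as arguments are inspected.

```shell
# Hypothetical helper: verify that SPARK_HOME is a directory and that the
# keytab file is readable by the current user. Returns non-zero on any failure.
check_zeppelin_prereqs() {
  spark_home=$1
  keytab=$2
  rc=0
  if [ ! -d "$spark_home" ]; then
    echo "missing SPARK_HOME directory: $spark_home"
    rc=1
  fi
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab"
    rc=1
  fi
  return $rc
}

# Example usage with the values from the settings above:
#   check_zeppelin_prereqs /usr/hdp/current/spark2-client/ \
#       /etc/security/keytabs/zeppelin.server.kerberos.keytab
# The principals stored in the keytab can then be listed with:
#   klist -kt /etc/security/keytabs/zeppelin.server.kerberos.keytab
```

Run it as the user that owns the Zeppelin process, since keytab readability depends on who is asking.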

b) Test R from the Zeppelin UI

[Screenshot: 16014-test-r.png]

c) Create a test CSV file on the local filesystem of the Zeppelin node

[root@dghdp254 ~]# ls -lrt /tmp/test.csv
-rw-r--r--. 1 root root 1326 Jun  6 07:07 /tmp/test.csv
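The article does not show how the file was created or what it contains, so the following is only an illustrative sketch. The function name make_test_csv and the sample rows are hypothetical; the mode 644 mirrors the -rw-r--r-- permissions in the listing above, which lets a non-root interpreter process read the file.

```shell
# Hypothetical helper: write a small sample CSV and make it world-readable.
# The column names and rows are placeholders, not the article's data.
make_test_csv() {
  csv_path=${1:-/tmp/test.csv}
  printf 'id,name\n1,alpha\n2,beta\n' > "$csv_path"
  chmod 644 "$csv_path"
}

# Example usage:
#   make_test_csv /tmp/test.csv
#   ls -l /tmp/test.csv
```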

d) Verify that R can read the file from the CLI

[root@dghdp254 ~]# R
R version 3.3.3 (2017-03-06) -- "Another Canoe"
Copyright (C) 2017 The R Foundation for Statistical Computing
Platform: x86_64-redhat-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> a<-read.csv("/tmp/test.csv")
> print(a)
[1] Test.File
<0 rows> (or 0-length row.names)
>

e) Restart the spark2 interpreter and run the following paragraph in a notebook

%spark2.r
a<-read.csv("/tmp/test.csv")
print(a)

[Screenshot: 16013-spark2-r-read-file.png]
