Created on 07-18-2018 05:58 PM
Short Description:
In this tutorial we will install RStudio Server, the browser-based version of RStudio, on the HDP Sandbox.
Article
Installing RStudio on HDP Sandbox
Introduction
RStudio is an Integrated Development Environment (IDE) for the R language. It includes a direct code execution console, as well as tools for plotting and debugging. You can find more information about the RStudio features here.
RStudio is the primary tool in the Predicting Airline Delays using SparkR Hortonworks tutorial, in which you will learn to train and analyze machine learning models that predict airline delays.
Prerequisites
1. SSH on to the Sandbox
Use the following command to SSH on to the Sandbox as root user:
ssh root@sandbox-hdp.hortonworks.com -p 2222
NOTE: If this is your first time signing in, the default password is hadoop.
2. Begin installation
On CentOS 7, the base OS for the Sandbox, R is available through the Extra Packages for Enterprise Linux (EPEL) repository, so we will enable that repository first.
yum install epel-release
Next, update the installed packages with yum:
yum update -y
3. Install R and RStudio
Let us begin by installing R:
yum install R -y
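Before moving on, it may help to confirm that the R binaries actually ended up on the PATH. The following is a minimal sketch, not part of the original steps; have_cmd is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: it succeeds if the named command
# can be found on the PATH, and fails otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the R interpreter and the Rscript front end are available.
for cmd in R Rscript; do
    if have_cmd "$cmd"; then
        echo "$cmd found at $(command -v "$cmd")"
    else
        echo "$cmd not found; consider re-running: yum install R -y"
    fi
done
```

If either command is reported as missing, the RStudio Server installation in the next step is unlikely to be usable, so it is worth resolving first.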
Now we may install RStudio Server:
wget https://download2.rstudio.org/rstudio-server-rhel-1.1.456-x86_64.rpm
sudo yum install rstudio-server-rhel-1.1.456-x86_64.rpm
NOTE: You can find the newest RStudio release here under Redhat/CentOS 64bit.
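If you install a different release, the version string has to match in both the download and the install command. The sketch below keeps it in one variable and only prints the two commands so you can review them before running anything. It assumes that other releases follow the same file-name pattern as 1.1.456, which may not hold for newer versions; the helper names rstudio_rpm_name and rstudio_rpm_url are hypothetical.

```shell
#!/bin/sh
# ASSUMPTION: other releases use the same file-name pattern as 1.1.456.
# Check the download page before relying on the generated URL.
RSTUDIO_VERSION="${RSTUDIO_VERSION:-1.1.456}"

# Hypothetical helper: build the RPM file name for a given version.
rstudio_rpm_name() {
    echo "rstudio-server-rhel-$1-x86_64.rpm"
}

# Hypothetical helper: build the download URL for a given version.
rstudio_rpm_url() {
    echo "https://download2.rstudio.org/$(rstudio_rpm_name "$1")"
}

# Dry run: print the commands instead of executing them.
echo "wget $(rstudio_rpm_url "$RSTUDIO_VERSION")"
echo "sudo yum install $(rstudio_rpm_name "$RSTUDIO_VERSION")"
```

With the default version, the printed commands are the same two commands used in this step.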
Finally, verify that the server is up and running:
systemctl status rstudio-server.service
You should see a message stating that the server is active.
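The service can take a moment to come up after installation, so a single status check may run too early. As a rough sketch, a small retry loop can poll until a check succeeds; wait_until is a hypothetical helper written for this example, not something shipped with RStudio or systemd.

```shell
#!/bin/sh
# wait_until is a hypothetical helper: run a command up to $1 times,
# sleeping $2 seconds between attempts, and stop as soon as it succeeds.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt remains.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

On the Sandbox it could be used as: wait_until 10 3 systemctl is-active --quiet rstudio-server.service && echo "RStudio Server is active". The attempt count and delay here are arbitrary example values.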
4. Assigning a Different Port for RStudio
Before changing the port, install dpkg and use dpkg-divert to move /sbin/initctl aside, then link it to /bin/true so that calls to initctl succeed without doing anything:
yum install -y dpkg
dpkg-divert --local --rename --add /sbin/initctl
ln -s /bin/true /sbin/initctl
By default, RStudio Server accepts connections on port 8787. The Sandbox already uses this port for another service, so we must assign the server a different port (in our case, port 60000).
echo "www-port=60000" | sudo tee -a /etc/rstudio/rserver.conf
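Because tee -a appends, running the command above more than once leaves several www-port lines in the file. A minimal sketch of an idempotent alternative follows; set_www_port is a hypothetical helper name, and the approach (replace the line if present, append otherwise) is one possible choice rather than anything RStudio prescribes.

```shell
#!/bin/sh
# set_www_port is a hypothetical helper: set www-port in an
# rserver.conf-style file, replacing an existing entry rather than
# appending a duplicate.
# Usage: set_www_port CONF_FILE PORT
set_www_port() {
    conf=$1
    port=$2
    # Make sure the file exists so grep and sed have something to read.
    touch "$conf"
    if grep -q '^www-port=' "$conf"; then
        # Rewrite through a temp file, then copy the contents back so the
        # original file keeps its ownership and permissions.
        sed "s/^www-port=.*/www-port=$port/" "$conf" > "$conf.tmp" &&
            cat "$conf.tmp" > "$conf" &&
            rm -f "$conf.tmp"
    else
        echo "www-port=$port" >> "$conf"
    fi
}
```

Run as root, set_www_port /etc/rstudio/rserver.conf 60000 would leave exactly one www-port entry no matter how often it is repeated.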
The next command will restart the server:
NOTE: The command will end the SSH connection to the Sandbox, because exec replaces your login shell with the server process. Do not panic; this is expected.
exec /usr/lib/rstudio-server/bin/rserver
5. Begin using RStudio
Open a web browser and navigate to:
http://sandbox-hdp.hortonworks.com:60000
You should see the sign-in screen for RStudio.
The username is amy_ds and the password is also amy_ds.
Congratulations! You may now start using RStudio along with the tools included in the HDP Sandbox for an enhanced Data Science experience.
Summary
In this tutorial we learned how to install R and RStudio Server on the Sandbox, and how to edit the server configuration file so that RStudio listens on a port that does not conflict with other Sandbox services.
Further Reading
You can go to the following links to explore tutorials using RStudio: