In this tutorial we will install the browser-based version of RStudio Server on the HDP Sandbox.
Installing RStudio on HDP Sandbox
Introduction
RStudio is an Integrated Development Environment (IDE) for the R language. It includes a console for direct code execution, as well as tools for plotting and debugging. You can find more information about the RStudio features here.
RStudio is the primary tool in the Predicting Airline Delays using SparkR Hortonworks tutorial, in which you will learn to train and analyze machine learning models that predict airline delays.
NOTE: You can find the newest RStudio release here under Redhat/CentOS 64bit.
Finally, verify that the server is up and running:
systemctl status rstudio-server.service
You should see a message stating that the server is active.
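If you would rather script this check than read the status output by eye, the sketch below wraps `systemctl is-active`, which exits with status 0 only when the unit is active. The helper name `check_rstudio` is hypothetical and not part of RStudio or the Sandbox; the unit name is the one used in the command above.

```shell
#!/usr/bin/env bash
# check_rstudio: hypothetical helper that reports whether a systemd unit is active.
# Defaults to the rstudio-server.service unit used in this tutorial.
check_rstudio() {
    local unit="${1:-rstudio-server.service}"
    # is-active --quiet prints nothing and signals the state through its exit code
    if systemctl is-active --quiet "$unit"; then
        echo "$unit is active"
    else
        echo "$unit is not active" >&2
        return 1
    fi
}
```

Calling `check_rstudio` after the installation returns a non-zero exit code when the service is not running, which makes it usable inside a larger setup script.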
4. Assigning a Different Port for RStudio
Install dpkg to divert the location of /sbin/initctl and assign a different port for RStudio:
yum install -y dpkg
dpkg-divert --local --rename --add /sbin/initctl
ln -s /bin/true /sbin/initctl
By default, RStudio accepts connections on port 8787. The Sandbox already uses this port for another service, so we must assign the server a different port; in this tutorial we use port 60000.
echo "www-port=60000" | sudo tee -a /etc/rstudio/rserver.conf
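Because `tee -a` appends, running the command above more than once leaves several `www-port` lines in the file. As a minimal sketch, assuming GNU `sed` (the default on CentOS), the hypothetical helper `set_www_port` below replaces an existing entry instead of adding a new one. It is demonstrated on a scratch file rather than on the real configuration.

```shell
#!/usr/bin/env bash
# set_www_port: hypothetical helper that sets www-port in an rserver.conf-style
# file without creating duplicate entries.
set_www_port() {
    local conf="$1" port="$2"
    touch "$conf"
    if grep -q '^www-port=' "$conf"; then
        # An entry already exists: rewrite it in place
        sed -i "s/^www-port=.*/www-port=${port}/" "$conf"
    else
        # No entry yet: append one
        echo "www-port=${port}" >> "$conf"
    fi
}

# Demonstration on a temporary file; calling the helper twice still yields one entry
demo_conf="$(mktemp)"
set_www_port "$demo_conf" 60000
set_www_port "$demo_conf" 60000
cat "$demo_conf"
```

To apply the same logic to the Sandbox, run `set_www_port /etc/rstudio/rserver.conf 60000` as root.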
The next command will restart the server:
NOTE: The command will end the SSH connection to the Sandbox. Do not panic; this is expected.
Your username is amy_ds and the password is amy_ds.
Congratulations! You may now start using RStudio along with the tools included in the HDP Sandbox for an enhanced Data Science experience.
Summary
In this tutorial we learned how to install RStudio and how to edit the server's configuration file so that RStudio listens on a port other than the default, avoiding conflicts with other services on our Sandbox.
Further Reading
You can go to the following links to explore tutorials using RStudio: