Member since
06-15-2018
14
Posts
10
Kudos Received
0
Solutions
08-24-2019
08:21 PM
Hi @cfarnes, thanks for the reply. I gave 28 GB of RAM to VirtualBox to run CDA.
07-18-2018
05:58 PM
4 Kudos
Short Description: In this tutorial we will install RStudio Server, the browser-based version of RStudio, on the HDP Sandbox.

Installing RStudio on HDP Sandbox

Introduction

RStudio is an Integrated Development Environment (IDE) for the R language. It includes a direct code execution console as well as tools for plotting and debugging; you can find more information about the RStudio features here. RStudio is the primary tool in the Predicting Airline Delays using SparkR Hortonworks tutorial, in which you will learn to train and analyze Machine Learning models to predict airline delays.

Prerequisites

Download the latest HDP Sandbox
Complete the Learning the Ropes of the Hortonworks Sandbox tutorial

1. SSH on to the Sandbox

Use the following command to SSH on to the Sandbox as root user:

ssh root@sandbox-hdp.hortonworks.com -p 2222

NOTE: If this is your first time signing on, the default password is hadoop.

2. Begin installation

On CentOS 7, the base OS for the Sandbox, R is available through the Extra Packages for Enterprise Linux (EPEL) repository, so we will install it first.

yum install epel-release

Next, update yum:

yum update -y
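Before moving on to the R installation, it may help to confirm that the EPEL repository is actually enabled. The sketch below is only an illustration: the helper name epel_enabled is hypothetical, and the sample listing is made up to resemble the layout of `yum repolist enabled` output rather than copied from a real Sandbox.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads a repository listing on stdin and succeeds
# if any repo id begins with "epel" (optionally prefixed by * or !).
epel_enabled() {
  grep -Eq '^[*!]?epel(/|[[:space:]])'
}

# Illustrative listing only (not real output from the Sandbox).
sample='repo id            repo name
base/7/x86_64      CentOS-7 - Base
epel/x86_64        Extra Packages for Enterprise Linux 7 - x86_64'

if printf '%s\n' "$sample" | epel_enabled; then
  echo "EPEL is enabled"
fi

# On the Sandbox itself the check would read the live listing instead:
#   yum repolist enabled | epel_enabled && echo "EPEL is enabled"
```

If the check fails on the Sandbox, rerunning yum install epel-release is the first thing to try.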
3. Install R and RStudio

Let us begin by installing R:

yum install R -y

Now we may install RStudio Server:

wget https://download2.rstudio.org/rstudio-server-rhel-1.1.456-x86_64.rpm
sudo yum install rstudio-server-rhel-1.1.456-x86_64.rpm

NOTE: You can find the newest RStudio release here under Redhat/CentOS 64bit.

Finally, verify that the server is up and running:

systemctl status rstudio-server.service

You should see a message stating that the server is active.

4. Assigning a Different Port for RStudio

Install dpkg to divert the location of /sbin/initctl and assign a different port for RStudio:

yum install -y dpkg
dpkg-divert --local --rename --add /sbin/initctl
ln -s /bin/true /sbin/initctl

By default RStudio accepts connections on port 8787; however, the Sandbox uses this port for another service, so we must assign the server a different port (in our case we will use port 60000).

echo "www-port=60000" | sudo tee -a /etc/rstudio/rserver.conf

The next command will restart the server.

NOTE: The command will end the SSH connection to the Sandbox. Do not panic; this is expected.

exec /usr/lib/rstudio-server/bin/rserver

5. Begin using RStudio

Open a web browser and navigate to:

http://sandbox-hdp.hortonworks.com:60000

You should see a sign-in screen for RStudio. Your username is amy_ds and the password is amy_ds.

Congratulations! You may now start using RStudio along with the tools included in the HDP Sandbox for an enhanced Data Science experience.

Summary

In this tutorial we learned how to install RStudio and how to edit the server configuration file to move RStudio off its default port, avoiding conflicts on our Sandbox.

Further Reading

You can go to the following links to explore tutorials using RStudio:

Predicting Airline Delays using SparkR
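One note on the port change in step 4: tee -a appends a new www-port line every time the command is run, so repeating the step leaves duplicate entries in /etc/rstudio/rserver.conf. A minimal sketch of a repeat-safe alternative follows. The function name set_www_port is hypothetical and not part of RStudio, and the demo runs against a temporary file rather than the real configuration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: set www-port in an rserver.conf-style file,
# replacing an existing entry instead of appending a second one.
set_www_port() {
  local conf="$1" port="$2" tmp
  touch "$conf"
  if grep -q '^www-port=' "$conf"; then
    # Rewrite the existing line; write back with cat so the file keeps
    # its original owner and permissions.
    tmp="$(mktemp)"
    sed "s/^www-port=.*/www-port=${port}/" "$conf" > "$tmp"
    cat "$tmp" > "$conf"
    rm -f "$tmp"
  else
    echo "www-port=${port}" >> "$conf"
  fi
}

# Demo on a scratch file: two calls still leave a single entry.
demo_conf="$(mktemp)"
set_www_port "$demo_conf" 8787
set_www_port "$demo_conf" 60000
cat "$demo_conf"   # prints: www-port=60000
rm -f "$demo_conf"
```

On the Sandbox, the equivalent call as root would be set_www_port /etc/rstudio/rserver.conf 60000, followed by the restart command from step 4.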
07-12-2018
06:23 PM
1 Kudo
Hello, the HDP Sandbox archive can be found here. Select the Sandbox Archive section, then scroll down and select HDP 2.5.