How to remotely connect to HDP2.4VM with RStudio using R?

Explorer

Hi,

I have experience using Hive with Ambari. However, I would like to use Hive on the HDP 2.4 VM from RStudio using R; simply put, I want to connect to the Hortonworks VM remotely using R. If anyone has done this, could you please tell me how, and where to find documentation online if it exists? I would also appreciate any tips on setting up the dependencies if you did this with RHive.

Thanks, Heath

@Heath Yates

Take a look at the posting below. It lists all the dependencies as well as setup instructions (not all steps will apply to you, though).

http://www.rdatamining.com/big-data/r-hadoop-setup-guide

Explorer

Sure. Will you be around tonight in case I have questions? I'll try this in an hour or two, after my commute home and dinner. I actually got RStudio working on the HDP sandbox. Do you think that will make things simpler? At that point everything is local, which should be easier. I do hope to access a Hadoop/Hive server remotely eventually, but I hope that working locally first will simplify this problem.

@Heath Yates

Link for the RStudio commercial (Pro) version:

https://www.rstudio.com/products/rstudio/download-commercial/

Pro will work for 45 days without a license.

Download Server:

https://www.rstudio.com/products/rstudio/download-server/

Documentation:

https://s3.amazonaws.com/rstudio-server/rstudio-server-pro-0.99.903-admin-guide.pdf

URL for connecting remotely:

http://<Sandbox IP>:8787/auth-sign-in
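
For illustration, here is a small shell sketch of how that sign-in URL is assembled. The helper name `rstudio_url` and the host 127.0.0.1 are placeholders rather than anything from this thread; 8787 is the port given in the URL above, and 127.0.0.1 stands in for whatever IP your sandbox actually has.

```shell
# Hypothetical helper: build the RStudio Server sign-in URL for a host.
# The port defaults to 8787, the port used in the URL quoted above.
rstudio_url() {
    host=$1
    port=${2:-8787}
    printf 'http://%s:%s/auth-sign-in\n' "$host" "$port"
}

# 127.0.0.1 is a placeholder for the sandbox IP.
rstudio_url 127.0.0.1
```

Passing the result to a browser, or to `curl -I`, is one way to check that the server is reachable from your machine.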

Explorer

Going to try HADOOP_HOME and HIVE_HOME paths soon with the tutorial I found. I will let you know if it works. In the meantime, could you please tell me how you found that path information? I am willing to learn and appreciate the time you have taken to reply.

I installed RStudio in my production environment and have been managing it for two years.

Explorer

I am getting the error '/root/RHive/usr/lib/hive/lib does not exist' when I run the ant build in the ~/Rhive directory. Please see the tutorial here for details. I am stuck on step 4.

Here is the link for RHive

https://github.com/nexr/RHive

Just FYI: RHive and rhdfs are both the same.

Explorer

Just kidding, the error has not been resolved. I will mark yours as the answer if I can get ant to build RHive or get Hive working in R. 🙂

Sure. If this is what you wanted, please upvote the response and accept it as the best answer.

Explorer

I am still getting the error for step 4. You mentioned rhdfs and RHive are both the same. Which is newer, and which one should I be using in R then? I am just trying to get Hive functionality in my R scripts. Thanks. 🙂

Explorer

I think we are close? The error states 'BUILD FAILED' at '/root/Rhive/build.xml:39: /root/RHive/usr/hdp/current/hive-server2/lib does not exist'. Shouldn't it be looking in /usr/hdp/current and not in /root/Rhive? I am not sure what build.xml is doing. 😞
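
One possible reading of this error, offered as a guess rather than a confirmed diagnosis: Ant resolves relative paths against the project's base directory, so a Hive location given without a leading slash (for example `usr/hdp/current/hive-server2`) would end up under the build directory, which matches the path in the message. The sketch below assumes that build.xml reads its locations from the `HIVE_HOME` and `HADOOP_HOME` environment variables and that each should contain a `lib` directory; check your copy of build.xml to confirm. The `check_home` helper is hypothetical.

```shell
# Sketch: sanity-check the directories the RHive Ant build is assumed to read.
# Assumption: build.xml uses HIVE_HOME and HADOOP_HOME and expects a lib/
# subdirectory under each; verify this against your own build.xml.

# Succeeds only if the value is an absolute path that contains lib/.
check_home() {
    name=$1
    value=$2
    case "$value" in
        "") echo "$name is not set"; return 1 ;;
        /*) ;;  # absolute path, carry on to the directory check
        *)  echo "$name=$value is relative; Ant would resolve it against the build directory"
            return 1 ;;
    esac
    if [ ! -d "$value/lib" ]; then
        echo "$name=$value has no lib/ directory"
        return 1
    fi
    echo "$name=$value looks usable"
}

# Report on the current environment without aborting the shell.
check_home HADOOP_HOME "${HADOOP_HOME:-}" || true
check_home HIVE_HOME "${HIVE_HOME:-}" || true
```

If either check reports a relative or missing path, exporting the variable with the full path (for instance one starting with `/usr/hdp/current/`) before rerunning `ant build` would be the first thing to try.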

You can try the tar.gz file from the link below if you have an issue with build.xml.

https://cran.r-project.org/src/contrib/Archive/RHive/

R CMD INSTALL RHive_2.0-0.10.tar.gz

Explorer

I'm not sure if this will fix the problem. I will try it, but I am curious as to why you suggest this. I think the original question I asked here has been resolved, but this ant build problem merits a separate question, which I have asked here.

Explorer

Here is a broken tutorial. I think we need the updated HADOOP_HOME and HIVE paths. Can someone please help? See tutorial for details.
