R Studio and Cloudera

New Contributor

Hello,

Since Jupyter Notebooks are not compatible with Cloudera 5.14, we are evaluating RStudio so that our data scientists can run their programs in Spark. From what I have seen, RStudio Server would be needed to connect to a remote Spark cluster. Is that an environment supported by Cloudera? Would it be possible to run RStudio Server outside the cluster? Is there any way to use RStudio Desktop with a remote Spark cluster?

Separately, if we want to use Python with Spark, would a setup such as Spyder connected remotely to a kernel running in the Cloudera cluster be supported?

Thanks in advance,

Iñigo

2 REPLIES

Re: R Studio and Cloudera

Expert Contributor

Hello @Inigo,

You can connect to the Spark cluster by using sparklyr.
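As a rough illustration of what a sparklyr connection can look like, here is a minimal sketch. It assumes R is running on a machine that already has the Spark client libraries and the cluster's Hadoop/YARN configuration available (for example a gateway or edge node), which may not match a server that sits fully outside the cluster. The `SPARK_HOME` path and the memory setting are placeholders, not values taken from this thread.

```r
library(sparklyr)

# Assumption: hypothetical parcel path; point this at the Spark client
# installed on the machine where R runs.
Sys.setenv(SPARK_HOME = "/opt/cloudera/parcels/CDH/lib/spark")

# Start from sparklyr's default configuration and adjust as needed.
conf <- spark_config()
conf$spark.executor.memory <- "2G"  # illustrative value only

# Connect to Spark through YARN in client mode.
sc <- spark_connect(master = "yarn-client", config = conf)

# Copy a small built-in data frame to Spark and count its rows
# to check that the connection works end to end.
mtcars_tbl <- sdf_copy_to(sc, mtcars, "mtcars", overwrite = TRUE)
sdf_nrow(mtcars_tbl)

# Release the cluster resources when finished.
spark_disconnect(sc)
```

The same code should work from RStudio Desktop or RStudio Server, as long as the R session can reach the cluster with the client configuration described above.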

Also, if you are still in the evaluation phase of deciding which tools to use, you may want to compare Cloudera's Data Science Workbench as well, if you haven't already done so.

Hope that helps.

Re: R Studio and Cloudera

New Contributor

Thank you @Consult. Our plan is to use sparklyr to connect to Spark in our cluster, but from RStudio Desktop or RStudio Server. In our case, RStudio Server is outside the cluster. Which steps should we follow to connect to a remote Spark cluster?

Cloudera Data Science Workbench is an option we may evaluate in the future. In that regard, does it need one or more separate nodes to run, or could it run on an existing edge node?
