sparkR with HDP 2.4 deployed on AWS ec2 connection error

Contributor

I am trying to use SparkR from RStudio on an HDP 2.4 cluster deployed on AWS EC2. I installed R, RStudio, and the other R packages, but after I log in to R and try to start a Spark context, I run into the problem below.

Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"),.libPaths()))

library(SparkR)

# Create a Spark context
sc <- SparkR::sparkR.init(master = "yarn-client")

Retrying connect to server: ip-xxx-xx-xx-xx.ec2.internal/xxx.xx.xx.xx:8050. Already tried 49 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
16/09/09 11:41:32 
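
For context, port 8050 in the error is the YARN ResourceManager address that the Spark client resolves from the Hadoop client configuration on the machine running the driver. Below is a minimal sketch of the same initialization, assuming HDP's usual client-config directory /etc/hadoop/conf (an assumption, not confirmed in this thread), with that lookup made explicit:

# Point the Spark YARN client at the cluster's Hadoop configs before starting SparkR.
# /etc/hadoop/conf is the HDP default location; adjust if your configs live elsewhere.
Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf")
Sys.setenv(YARN_CONF_DIR   = "/etc/hadoop/conf")
Sys.setenv(SPARK_HOME      = "/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))

library(SparkR)

# In yarn-client mode the driver reads yarn.resourcemanager.address (port 8050 on HDP)
# from those configs to decide where to connect.
sc <- sparkR.init(master = "yarn-client")

The retry loop in the error simply means the driver cannot reach the ResourceManager at that address.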
1 ACCEPTED SOLUTION

Super Guru

@Fish Berh

I assume that your RStudio is on your laptop, and it seems you are trying to reach the cluster's internal IP from there. You need to reference the server's public IP or public URI instead.
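
If the driver really is running outside the cluster, one way to apply this, sketched here with a placeholder hostname and relying on Spark's generic spark.hadoop.* property pass-through (an assumption about the fix, not something confirmed in this thread), is to override the ResourceManager address when initializing the context:

# Placeholder public hostname; substitute the server's actual public DNS name.
rm_host <- "ec2-yy-yyy-yyy-yy.compute-1.amazonaws.com"

# spark.hadoop.* properties are copied into the driver's Hadoop configuration,
# overriding the internal address read from yarn-site.xml. 8050 and 8030 are the
# usual HDP ResourceManager and scheduler ports.
sc <- sparkR.init(
  master     = "yarn-client",
  sparkEnvir = list(
    "spark.hadoop.yarn.resourcemanager.address"           = paste0(rm_host, ":8050"),
    "spark.hadoop.yarn.resourcemanager.scheduler.address" = paste0(rm_host, ":8030")
  )
)

This also assumes those ports are open in the EC2 security group, which is usually not the case by default.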


5 REPLIES

Super Guru

@Fish Berh

You may also want to check a few options here:

https://blog.rstudio.org/tag/sparkr/

Contributor

Actually, I logged in to RStudio using the address of the server:

ec2-yy-yyy-yyy-yy.compute-1.amazonaws.com:8787

Super Guru

@Fish Berh

Could you vote and accept my response? I suggested using the public URI of the server.

Contributor

I used the public URI of the server to log in and still got the error.
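
Since RStudio Server here appears to be running on a cluster node (port 8787), a quick sanity check, assuming HDP's standard client-config path /etc/hadoop/conf, is to confirm from the R session which ResourceManager address the local configs point to and whether that host matches the one shown in the retry error:

# Print the yarn.resourcemanager.address entry from the local YARN client config.
# The path is the HDP default; adjust if the configs are installed elsewhere.
cat(system("grep -A 1 'yarn.resourcemanager.address' /etc/hadoop/conf/yarn-site.xml",
           intern = TRUE), sep = "\n")

If the address matches the host in the error, the next thing to verify is that the ResourceManager service itself is up, for example in Ambari.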