Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

sparkR with HDP 2.4 deployed on AWS ec2 connection error


I am trying to use SparkR from RStudio on an HDP 2.4 cluster deployed on AWS EC2. I installed R, RStudio, and the other R packages, but when I log in to R and try to start a Spark context, I get the error below.

Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"),.libPaths()))

library(SparkR)

# Create a spark context and a SQL context
sc <- SparkR::sparkR.init(master = "yarn-client")

Retrying connect to server: ip-xxx-xx-xx-xx.ec2.internal/xxx.xx.xx.xx:8050. Already tried 49 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
16/09/09 11:41:32 
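Port 8050 is the default YARN ResourceManager RPC port on HDP; the YARN client reads that address from `yarn.resourcemanager.address` in `yarn-site.xml` (normally under `/etc/hadoop/conf` on a cluster node). A minimal sketch of how to check which host:port the client will try to reach, using a small sample config file so the commands run anywhere (on a real node you would point at the cluster's own `yarn-site.xml`):

```shell
# Create a sample yarn-site.xml; on an HDP node the real file is
# typically /etc/hadoop/conf/yarn-site.xml.
conf_dir=$(mktemp -d)
cat > "$conf_dir/yarn-site.xml" <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>ip-xxx-xx-xx-xx.ec2.internal:8050</value>
  </property>
</configuration>
EOF

# Extract the host:port the YARN client will try to connect to.
rm_addr=$(grep -A1 'yarn.resourcemanager.address' "$conf_dir/yarn-site.xml" \
          | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p')
echo "ResourceManager address: $rm_addr"

# On the cluster you would then test reachability from the machine
# running SparkR, e.g.:
#   nc -vz ${rm_addr%:*} ${rm_addr#*:}
```

If that address is an EC2-internal hostname and the machine running SparkR is outside the VPC, the connection attempts will time out exactly as in the log above.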
1 ACCEPTED SOLUTION

Super Guru

@Fish Berh

I assume that your RStudio is on your laptop. It seems that you are trying to access an internal IP from your laptop. You need to reference a public IP or the public URI of your server.
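One quick way to confirm this: internal EC2 hostnames (`*.ec2.internal`) only resolve inside the VPC, so you can test whether the name your client is using resolves at all from the machine running R. A minimal sketch, assuming a Linux host with `getent` available:

```shell
# Check whether a hostname resolves from this machine. Internal EC2
# names (*.ec2.internal) resolve only inside the VPC; from a laptop
# you need the instance's public DNS name instead.
check_resolves() {
  if getent hosts "$1" > /dev/null 2>&1; then
    echo "$1 resolves here"
  else
    echo "$1 does NOT resolve here -- use the public DNS name instead"
  fi
}

check_resolves ip-xxx-xx-xx-xx.ec2.internal
```

If the internal name does not resolve from where RStudio runs, the YARN connection can never succeed from that machine.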


5 REPLIES

Super Guru

@Fish Berh

I assume that your RStudio is on your laptop. It seems that you are trying to access an internal IP from your laptop. You need to reference a public IP or the public URI of your server.

Super Guru

@Fish Berh

You may also want to check a few options here:

https://blog.rstudio.org/tag/sparkr/


Actually, I logged in to RStudio using the address of the server:

ec2-yy-yyy-yyy-yy.compute-1.amazonaws.com:8787

Super Guru

@Fish Berh

Could you vote and accept my response? I suggested using the public URI of the server.


I used the public URI of the server to log in and still got the error.