Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

sparkR - Error in socketConnection(port = monitorPort)

Expert Contributor

I am running CentOS 7.1.

On my multinode Hadoop cluster (2.3.4) I have installed Spark 1.5.2 through Ambari. I am trying to start sparkR from the CLI, and after I run sparkR I get the following error:

Error in value[[3L]](cond) : Failed to connect JVM
In addition: Warning message:
In socketConnection(host = hostname, port = port, server = FALSE, :
  localhost:9001 cannot be opened

The port (9001) is open on the namenode (where I'm running sparkR). Do you have any ideas what I'm doing wrong? I've seen this link: http://hortonworks.com/hadoop-tutorial/apache-spark-1-5-1-technical-preview-with-hdp-2-3/

and I also followed this link:

http://www.jason-french.com/blog/2013/03/11/installing-r-in-linux/

to install R on all datanodes. I appreciate your contribution.

1 ACCEPTED SOLUTION

Expert Contributor

@Neeraj Sabharwal, @Artem Ervits

I made it work now!

I've used Ubuntu 14.04 Trusty, manually installed Spark 1.4.1, and set up sparkR. Now, I don't know if the problem was in CentOS 7.2, but the installation of R was different from what I did earlier and from what it says here:

http://www.jason-french.com/blog/2013/03/11/installing-r-in-linux/

If you guys want, I can try the same on CentOS 7.2 and report back. If you want, I can also describe the process of preparing the environment for using sparkR. I will also try other Spark versions. We depend on R for our research.

Let me know if there is interest.


33 REPLIES

Master Mentor

@marko Check whether a firewall is blocking the port on each node.
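A minimal sketch of one way to do that check, assuming the nodes use firewalld (the default on CentOS 7) and that `firewall-cmd --list-ports` prints a space-separated list such as `22/tcp 9001/tcp`. The helper name `port_in_list` is made up for illustration, and it does not expand port ranges or look at services or rich rules:

```shell
# Succeeds if PORT/PROTO (e.g. 9001/tcp) appears in a space-separated
# port list read from stdin, such as the output of `firewall-cmd --list-ports`.
# Assumption: entries are single ports; ranges like 9000-9010/tcp are not expanded.
port_in_list() {
  wanted="$1"
  # Put each entry on its own line, then look for an exact whole-line match.
  tr -s ' ' '\n' | grep -Fxq -- "$wanted"
}

# Example usage on a node (typically needs root):
#   firewall-cmd --list-ports | port_in_list 9001/tcp && echo "9001/tcp is allowed"
```

Note that a connection to localhost usually does not pass through the host firewall rules at all, so a negative result here would not by itself explain a failure to reach localhost:9001.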

Expert Contributor

@Artem Ervits

I've seen this one as well; I don't see a big difference between this one and 1.5.2.

I have SPARK_HOME and JAVA_HOME defined.

My hive-site.xml is also in place.

If I scroll down to the SparkR part: R is installed on all the nodes.

By the way, when I run sparkR, I don't get the nice Spark graphic (logo); it seems as if I'm starting just R.

Master Mentor

@marko

I recommend this: http://hortonworks.com/hadoop/spark/#section_6 but your link is good too, given your Spark version.

Master Mentor

@marko Now, let's troubleshoot the 9001 issue.

netstat -anp | grep 9001 --> what's the output?

Expert Contributor

@Neeraj Sabharwal

Running sudo netstat -anp | grep 9001

returns:

unix 2 [ ACC ] STREAM LISTENING 9001 1202/master private/proxywrite
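A note on reading that line: in netstat's unix-socket table the columns are Proto, RefCnt, Flags, Type, State, I-Node, PID/Program name and Path, so the 9001 here appears to be the socket's inode number (owned by what looks like Postfix's master process), not a TCP port. A plain `grep 9001` matches it by coincidence. The sketch below, with a made-up helper name `tcp_listeners_on_port`, narrows the output to TCP sockets in LISTEN state on a given port; it assumes the usual `netstat -anp` column layout from net-tools:

```shell
# Keep only TCP (tcp/tcp6) sockets in LISTEN state whose local address
# ends in :PORT. Reads `netstat -anp` style output on stdin.
# Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State PID/Program
tcp_listeners_on_port() {
  port="$1"
  awk -v p=":$port" '$1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (p "$") { print }'
}

# Example usage:
#   sudo netstat -anp | tcp_listeners_on_port 9001
# Roughly equivalent without the helper:
#   sudo netstat -tlnp | grep ':9001 '
```

If this prints nothing, no process is listening on TCP port 9001, which would itself be consistent with "localhost:9001 cannot be opened".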

Master Mentor

@marko ps -ef | grep 1202

If you don't need it then kill it ...

Expert Contributor

@Neeraj Sabharwal

I killed the process and also restarted Spark from Ambari.

If I run sudo netstat -anp | grep 9001, I don't see anything.

I also have this one in my bashrc on the node where I'm running sparkR:

export EXISTING_SPARKR_BACKEND_PORT=9001

Funny thing: if I run sparkR with my centos user, I get the error mentioned in the original post.

If I run sudo -u spark sparkR, then I get:

Error in socketConnection(port = monitorPort) : cannot open the connection
In addition: Warning message:
In socketConnection(port = monitorPort) : localhost:53654 cannot be opened
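One thing that may be worth checking here is the EXISTING_SPARKR_BACKEND_PORT export. As far as I can tell from the SparkR 1.x sources, when this variable is set, sparkR.init skips launching its own JVM backend and tries to attach to one that is already listening on that port. With nothing listening on 9001, that would be consistent with "Failed to connect JVM" for the centos user, and with a different error under sudo -u spark, since sudo typically does not pass the caller's exported variables through. This is an assumption about the cause, not something confirmed in this thread. A minimal sketch to see whether the variable is set in the current shell (the function name `sparkr_backend_port_status` is made up for illustration):

```shell
# Report whether EXISTING_SPARKR_BACKEND_PORT is set (and non-empty)
# in the current environment.
sparkr_backend_port_status() {
  if [ -n "${EXISTING_SPARKR_BACKEND_PORT:-}" ]; then
    echo "set:${EXISTING_SPARKR_BACKEND_PORT}"
  else
    echo "unset"
  fi
}

# Example usage before starting sparkR:
#   sparkr_backend_port_status
#   unset EXISTING_SPARKR_BACKEND_PORT   # for this shell session only
#   sparkR
```

The `unset` only affects the current shell; the export line in bashrc would need to be removed or commented out to make the change stick.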

Master Mentor

@marko Interesting...see this

http://pastebin.com/VefGqMea