Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

sparkR - Error in socketConnection(port = monitorPort)

Expert Contributor

I am running CentOS 7.1.

On my multi-node Hadoop cluster (HDP 2.3.4) I installed Spark 1.5.2 through Ambari. When I run sparkR from the CLI, I get the following error:

Error in value[[3L]](cond) : Failed to connect JVM
In addition: Warning message:
In socketConnection(host = hostname, port = port, server = FALSE, :
  localhost:9001 cannot be opened

Port 9001 is open on the namenode (where I'm running sparkR). Do you have any idea what I'm doing wrong? I've seen this link: http://hortonworks.com/hadoop-tutorial/apache-spark-1-5-1-technical-preview-with-hdp-2-3/

I also followed this link to install R on all datanodes:

http://www.jason-french.com/blog/2013/03/11/installing-r-in-linux/

I appreciate your contributions.

1 ACCEPTED SOLUTION

Expert Contributor

@Neeraj Sabharwal, @Artem Ervits

I made it work now!

I used Ubuntu 14.04 Trusty, manually installed Spark 1.4.1, and set up sparkR. I don't know whether the problem was in CentOS 7.2, but the installation of R was different from what I did earlier and from what is described here:

http://www.jason-french.com/blog/2013/03/11/installing-r-in-linux/

If you want, I can try the same on CentOS 7.2 and report back. I can also describe the process of preparing the environment for sparkR, and I will try other Spark versions as well. We depend on R for our research.

Let me know if there is interest.


33 REPLIES

Expert Contributor

@Neeraj Sabharwal

hmmm..

So this is the part where the show ends for me:

Launching java with spark-submit command /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit "sparkr-shell" /tmp/Rtmp69Q264/backend_portae4c24444ac20

So now I checked if spark-submit works by running the following example:

cd $SPARK_HOME

sudo -u spark ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10

And the result is

Lots of these:

INFO Client: Application report for application_1455610402042_0021 (state: ACCEPTED)

Then:

SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.

The full error is in the attached file:

spark-submit-error.txt

Am I missing something in Spark setup?

Expert Contributor

@Neeraj Sabharwal

I ran the same spark-submit command with ONE difference:

--master was yarn-cluster

I came to the status FINISHED:

INFO Client: Application report for application_1455610402042_0022 (state: FINISHED)

and it ended up with this:

Exception in thread "main" org.apache.spark.SparkException: Application application_1455610402042_0022 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:974)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1020)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)

...

16/02/16 18:15:58 INFO ShutdownHookManager: Shutdown hook called
16/02/16 18:15:58 INFO ShutdownHookManager: Deleting directory /tmp/spark-3ba9f87c-18c2-4d0d-b360-49fa10408631

Master Mentor

@marko yarn log -applicationid application_1455610402042_0021

What is the output of the above command? I hope it's not failing because of memory.

Expert Contributor

@Neeraj Sabharwal

I ran the command you recommended and I'm getting an error saying:

Error: Could not find or load main class log
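
The message above is most likely what the yarn launcher script prints when it is handed a subcommand it does not recognize (here, log): it treats the word as a Java class name and fails to load it. The subcommand that fetches aggregated container logs is logs, and its flag is spelled -applicationId. A minimal sketch, assuming log aggregation is enabled on the cluster; the helper name yarn_logs_cmd is made up for illustration and only prints the command so it can be reviewed before running:

```shell
# Hypothetical helper (not part of YARN): build the log-fetch command
# for a given application ID without executing it.
yarn_logs_cmd() {
  app_id="$1"
  # Application IDs look like application_<clusterTimestamp>_<sequence>.
  case "$app_id" in
    application_*_*) ;;
    *)
      echo "not a YARN application ID: $app_id" >&2
      return 1
      ;;
  esac
  printf 'yarn logs -applicationId %s\n' "$app_id"
}

# Prints: yarn logs -applicationId application_1455610402042_0021
yarn_logs_cmd application_1455610402042_0021
```

Running the printed command as the user who submitted the job (spark in this thread) usually returns the ApplicationMaster's stderr, which is where a memory-related failure would be reported.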

I have set up another cluster (namenode + 3 datanodes) using Ambari. I followed this link:

http://hortonworks.com/hadoop-tutorial/apache-spark-1-6-technical-preview-with-hdp-2-3/

I installed R on all the nodes.

All examples worked until I came to the sparkR:

Launching java with spark-submit command /usr/hdp/2.3.4.0-3485/spark/bin/spark-submit "sparkr-shell" /tmp/Rtmphs2DlM/backend_port3b7b4c9a912b
16/02/17 09:37:36 WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.

Error in socketConnection(port = monitorPort) :

cannot open the connection

In addition: Warning message:

In socketConnection(port = monitorPort) : localhost:40949 cannot be opened

>

I've opened all the ports (1-65535) and port 0 on the namenode and the datanodes.

@Artem Ervits - do you have any idea what I am missing?
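
The warning names localhost, so the failing connection appears to go from the R process to a JVM on the same machine; rules between nodes would not normally affect it. One thing worth checking is whether anything is listening on that port at the moment the error appears. A rough sketch, assuming bash (it relies on bash's /dev/tcp pseudo-device); the function name port_open is made up for illustration:

```shell
# Sketch: test whether something accepts TCP connections on host:port.
# Relies on bash's /dev/tcp; shells without it will always report "closed".
port_open() {
  # $1 = host, $2 = port. The subshell keeps the probe's descriptor from leaking.
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# 40949 is the port from the warning above; the port differs between launches in this thread.
if port_open localhost 40949; then
  echo "localhost:40949 is accepting connections"
else
  echo "nothing is accepting connections on localhost:40949"
fi
```

If nothing is listening, the more likely explanation is that the JVM started by spark-submit exited or never opened the port, rather than that a firewall dropped the connection.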

Expert Contributor

I'm looking at the code:

https://github.com/amplab-extras/SparkR-pkg/blob/master/pkg/src/src/main/scala/edu/berkeley/cs/ampla...

The environment variable EXISTING_SPARKR_BACKEND_PORT can be defined through bashrc.

The tryCatch that returns my error is the following:

tryCatch({
	connectBackend("localhost", backendPort)
}, error = function(err) {
	stop("Failed to connect JVM\n")
})

Isn't it interesting that localhost is hard-coded this way? Or is there an explanation for it?

Expert Contributor

I have now installed Spark 1.6.0, just to test whether Ambari makes some changes during Spark installation. Same result:

Error in socketConnection(port = monitorPort) :
  cannot open the connection
In addition: Warning message:
In socketConnection(port = monitorPort) :
  localhost:51604 cannot be opened

Could it be YARN?

Master Mentor

@marko please check whether a firewall is blocking the connection.

Expert Contributor

@Artem Ervits

I ran:

sudo systemctl status firewalld

And the result is this:

firewalld.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)

Master Mentor

Is this CentOS 7? If not, try the command below, and make sure to run it on all nodes:

sudo service iptables stop
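
A firewalld unit reported as not-found only means that service is not installed; packet-filter rules can still be loaded through an iptables service or directly. The sketch below reports the state of both units where systemctl is available. The function name firewall_report is made up for illustration, and the fallback hint assumes an older SysV-style release:

```shell
# Sketch: report the state of the common firewall services on this host.
firewall_report() {
  if command -v systemctl >/dev/null 2>&1; then
    for unit in firewalld iptables; do
      # is-active prints the state and exits non-zero when the unit is not active.
      state=$(systemctl is-active "$unit" 2>/dev/null) || true
      echo "$unit: ${state:-unknown}"
    done
  else
    echo "systemctl not found; on older releases try: sudo service iptables status"
  fi
}

firewall_report
```

To see the rules that are actually loaded, regardless of which service manages them, sudo iptables -L -n can be run on each node.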