
Unable to run multiple pyspark sessions

Explorer

I am new to Cloudera. I have installed Cloudera Express on a CentOS 7 VM and created a cluster with 4 nodes (another 4 VMs). I ssh to the master node and run: pyspark

This works, but only for one session. If I open another console and run pyspark, I get the following warning:

 

WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.

 

And it gets stuck there and does nothing until I close the other session running pyspark! Any idea why this is happening and how I can fix it so multiple sessions/users can run pyspark? Am I missing some configuration somewhere?

 

Thanks in advance for your help.

11 REPLIES

Champion

@hedy

 

In general, one port will allow one session (one connection) at a time. Your first session binds to the default port 4040; your second session tries to bind to the same port, hits the bind issue, and then attempts the next port, but that is not working either.

 

There are two things that you need to check:

1. Please make sure that port 4041 is open.

2. In your second session, when you run pyspark, pass an available port as a parameter.

 

     Ex: Long back I've used spark-shell with a different port as a parameter; please try a similar option for pyspark:

     session1: $ spark-shell --conf spark.ui.port=4040
     session2: $ spark-shell --conf spark.ui.port=4041

 

     If 4041 is not working, you can try up to 4057; I think these are the ports available to Spark by default.
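
     If it helps, pyspark accepts the same --conf option as spark-shell (both hand it through to spark-submit), so the equivalent for your case would be (a sketch, untested on your cluster):

     session1: $ pyspark --conf spark.ui.port=4040
     session2: $ pyspark --conf spark.ui.port=4041

     The size of that automatic retry window is controlled by spark.port.maxRetries (default 16, if I remember right), which is why the probing gives up after a dozen or so ports.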

Explorer

Thank you for your help. I tried different ports, but it still doesn't work unless I kill the running session and start another one. Could it be that I had a wrong configuration during the Cloudera installation? Or do changes need to be made in some configuration file or somewhere else?

Champion

@hedy

 

Did you get a chance to find the answer to my first question?

Explorer

@saranvisa Sorry, forgot to mention that... yes I did. The port is open.

Champion

@hedy

 

Can you try running the 2nd pyspark command from a different user ID?

 

Because it seems this is a normal issue, according to the link below:

 

https://support.datastax.com/hc/en-us/articles/207356773-FAQ-Warning-message-java-net-BindException-...

 

Explorer

@saranvisa

 

Just tried that. It's not working for different users either.

Explorer

It looks like things cannot run in parallel, but rather in a queue. Maybe I missed or misconfigured something during the installation process.

Master Collaborator

 

WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.

 

^ This generally means that the problem is beyond the port mapping (i.e. at the queue configuration, available resources, or YARN level).

 

Assuming that you are using Spark 1.6, I'd suggest temporarily changing the shell logging level to INFO and seeing if that gives a hint. The easy and quick way to do this is to edit /etc/spark/conf/log4j.properties on the node you are running pyspark from and change the log level from WARN to INFO.

 

# vi /etc/spark/conf/log4j.properties 
shell.log.level=INFO
 

$ spark-shell
....
18/04/10 20:40:50 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
18/04/10 20:40:50 INFO util.Utils: Successfully started service 'SparkUI' on port 4041.
18/04/10 20:40:50 INFO client.RMProxy: Connecting to ResourceManager at host-xxx.cloudera.com/10.xx.xx.xx:8032
18/04/10 20:40:52 INFO impl.YarnClientImpl: Submitted application application_1522940183682_0060
18/04/10 20:40:54 INFO yarn.Client: Application report for application_1522940183682_0060 (state: ACCEPTED)
18/04/10 20:40:55 INFO yarn.Client: Application report for application_1522940183682_0060 (state: ACCEPTED)
18/04/10 20:40:56 INFO yarn.Client: Application report for application_1522940183682_0060 (state: ACCEPTED)
18/04/10 20:40:57 INFO yarn.Client: Application report for application_1522940183682_0060 (state: ACCEPTED)

 

 

Next, open the Resource Manager UI and check the state of the application (i.e. your second invocation of pyspark) -- whether it is registered but stuck in the ACCEPTED state, like this:

 

[Screenshot: RM UI, application stuck in the ACCEPTED state]
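
If it's easier than the UI, the same check can be done from a terminal with the YARN CLI (a sketch; yarn application -list and its -appStates filter are standard in Hadoop 2.x):

# list applications registered with YARN but still waiting for resources
$ yarn application -list -appStates ACCEPTED

Any application listed there has been accepted by the scheduler but has not yet been granted the containers it asked for.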

 

 

If yes, look at the Cluster Metrics row at the top of the RM UI page and see if there are enough resources available:

 

[Screenshot: Cluster Metrics row at the top of the RM UI]
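
If the metrics show that the first shell has already claimed most of the memory/vcores, one thing worth trying (a sketch with illustrative numbers; the flags are standard spark-submit options) is to start each shell with a smaller resource ask, so two can fit side by side:

# request a deliberately small footprint per shell
$ pyspark --master yarn --num-executors 1 --executor-memory 1g --executor-cores 1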

 

 

Now kill the first pyspark session and check whether the second session changes its state to RUNNING in the RM UI. If yes, look at the queue placement rules and stats in Cloudera Manager > Yarn > Resource Pools Usage (and Configuration).

 

 

[Screenshot: Cloudera Manager > Yarn > Resource Pools Usage]

 

 

Hopefully, this will give us some more clues. Let us know how it goes. Feel free to share screenshots from the RM UI and the spark-shell INFO logging.

Explorer

Thanks. I really appreciate your response. My advisor actually found out that this will work if we use the following command:

 

$ pyspark --master local[i]

 

where i is a number. Using this command, multiple pyspark shells can run concurrently. But why the other solutions did not work, I have no clue!
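
(A plausible explanation, for anyone who finds this later: --master local[i] runs the shell in local mode with i worker threads inside a single JVM on that node, so it never asks YARN for containers at all. That fits the diagnosis above -- the second shell was most likely stuck in ACCEPTED waiting for YARN to grant it resources. For example:

session1: $ pyspark --master local[2]
session2: $ pyspark --master local[2]

Both shells start immediately, since they only compete for UI ports, not for YARN containers -- but note that they then run on that single node only, not across the cluster.)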