Contributor
Posts: 55
Registered: ‎06-24-2018

pyspark gets stuck on launch - not responding - i am running out of time

Hello, 

 

Due to limitations of Spark ML 1.6, I had to upgrade to Spark 2; every configuration is fine.

 

I have a 4-host cluster. If I launch pyspark from the master, it gets stuck at launch, or else it warns that it couldn't bind the UI port and is trying 4041, and so on.

 

The strange thing is that all those ports are unoccupied. Can somebody help?

 

 


Master
Posts: 430
Registered: ‎07-01-2015

Re: pyspark gets stuck on launch - not responding - i am running out of time

I don't know why it gets stuck, but the warning is about the open ports. Each Spark program (driver) opens a UI port, starting from 4040 onwards. So when 4040 is occupied (by another Spark driver) it tries 4041, and so on, until it reaches the maximum number of retries and returns an error.
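
For example, a rough sketch of the two settings involved (just an illustration, assuming default settings, where the driver gives up after spark.port.maxRetries attempts, i.e. roughly ports 4040-4056):

# pin the UI port explicitly instead of letting the driver probe upwards
pyspark --master yarn --conf spark.ui.port=4050

# or widen the probing window beyond the default 16 retries
pyspark --master yarn --conf spark.port.maxRetries=32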
Posts: 519
Topics: 14
Kudos: 92
Solutions: 45
Registered: ‎09-02-2016

Re: pyspark gets stuck on launch - not responding - i am running out of time

@hadoopNoob

 

Yes, it may be due to the port. Please try the below:

 

export SPARK_MAJOR_VERSION=2
pyspark --master yarn --conf spark.ui.port=12888
pyspark --master yarn --conf spark.ui.port=4041
pyspark --master yarn --conf spark.ui.port=4042
etc
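
If you want a quick sanity check that the Spark 2 binaries are actually being picked up after setting SPARK_MAJOR_VERSION (assuming the HDP-style version switch above):

export SPARK_MAJOR_VERSION=2
spark-submit --version    # the banner should report a 2.x version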

Contributor
Posts: 55
Registered: ‎06-24-2018

Re: pyspark gets stuck on launch - not responding - i am running out of time

I tried your suggestion already, but did it again and now it gets stuck here. Upon pressing Ctrl+C it skips to the executor.

 

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).

 

 

 

The strange thing is that it works fine on the other nodes. Should I just use them instead?

Posts: 519
Topics: 14
Kudos: 92
Solutions: 45
Registered: ‎09-02-2016

Re: pyspark gets stuck on launch - not responding - i am running out of time

@hadoopNoob

 

If the command is working on the other nodes, then run the netstat command again on both nodes (for the ports starting at 4040) to see the difference.

 

It is clear that it is not a Spark issue, since it works from the other nodes, so you have to identify the port open/availability status on the master.
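
A sketch of that comparison, assuming netstat is available on both hosts (the 4040-4059 range covers the default UI port retries):

# on the master, where pyspark hangs
netstat -tlnp | grep ':40[4-5][0-9]'

# on a node where pyspark launches fine
netstat -tlnp | grep ':40[4-5][0-9]'

If the master shows listeners that the working node does not, the -p column tells you which process is holding the port.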