
PySpark issue on Windows (Java gateway process exited before sending the driver its port number)

Explorer

Hello, I have installed the Spark 1.3.1 binary (pre-built for Hadoop 2.3) on my Windows machine. I have set up the appropriate paths (see below), but when I run my script it fails to initialize the context with this error:

Traceback (most recent call last):
  File "C:/Users/N600173/PycharmProjects/Test/test.py", line 30, in <module>
    sc = SparkContext(master="local")
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\context.py", line 107, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\context.py", line 221, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\dev\spark-1.3.1-bin-hadoop2.3\python\pyspark\java_gateway.py", line 79, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

What does this mean? Note: I am able to run the Spark shell successfully.

Code:

import os
import sys

os.environ['SPARK_HOME'] = "C:/spark-1.2.0-bin-hadoop2.3"
os.environ['JAVA_HOME'] = "C:/jdk1.7.0_45"
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/python")
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/python/lib/py4j-0.8.2.1-src.zip")
sys.path.append("C:/spark-1.2.0-bin-hadoop2.3/lib/spark-assembly-1.2.0-hadoop2.3.0.jar")
sys.path.append("C:/jdk1.7.0_45/bin")

from pyspark import SparkContext
sc = SparkContext(master="local")  # <-- this fails!
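For anyone hitting the same exception: it is raised when the JVM child process that PySpark spawns (via the spark-submit scripts) exits before reporting its port back to the Python driver. A minimal, hypothetical sanity check, not part of the original post, that the JAVA_HOME set above actually resolves to a runnable java binary:

import os
import subprocess

# Hypothetical diagnostic: if this call fails, the gateway JVM
# cannot start either. The path is taken from the snippet above.
java_home = "C:/jdk1.7.0_45"
java = os.path.join(java_home, "bin", "java.exe")
subprocess.check_call([java, "-version"])  # prints the JVM version to stderr

Separately, the traceback runs out of C:\dev\spark-1.3.1-bin-hadoop2.3 while the snippet points SPARK_HOME at C:/spark-1.2.0-bin-hadoop2.3; a stale path like that is worth ruling out.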

 

1 REPLY

Super Collaborator

Can you check the path separator? On Windows I would have expected \ rather than /.

Can you also explain how you start PySpark: do you use the cmd scripts, or do you run it under Cygwin?
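For example, a sketch of the same bootstrap using raw strings with backslash separators (paths are illustrative, based on the locations mentioned above; adjust them to your actual install):

import os
import sys

# Illustrative paths with Windows-style separators; raw strings
# avoid backslash-escape problems in path literals.
spark_home = r"C:\dev\spark-1.3.1-bin-hadoop2.3"
os.environ['SPARK_HOME'] = spark_home
os.environ['JAVA_HOME'] = r"C:\jdk1.7.0_45"
sys.path.append(os.path.join(spark_home, "python"))
sys.path.append(os.path.join(spark_home, "python", "lib", "py4j-0.8.2.1-src.zip"))

from pyspark import SparkContext
sc = SparkContext(master="local")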

 

BTW: we do not test Windows as a client, so you might be seeing a known issue.

 

Wilfred