
PyCharm error: Java gateway process exited before sending the driver its port number

Dear community,

I'm getting an error in PyCharm (CDH 5.8.0 and Spark 1.6.2) with the following code, which was working on my Mac in standalone mode:

#!/usr/bin/env python

# Imports
import sys
import os
import logging

# Basic logging configuration so the logging calls below actually produce output
logging.basicConfig(level=logging.INFO)

# Path for spark source folder
os.environ['SPARK_HOME'] = "/opt/cloudera/parcels/CDH/lib/spark"
os.environ['JAVA_HOME'] = "/opt/jdk1.8.0_101/bin/"
os.environ['PYSPARK_SUBMIT_ARGS'] = "--master yarn pyspark-shell"

# Append pyspark  to Python Path
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python/")
sys.path.append("/opt/jdk1.8.0_101/bin/")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    from pyspark.sql import HiveContext

    logging.info("Successfully imported Spark Modules")

except ImportError as e:
    logging.error("Cannot import Spark modules: %s", e)
    sys.exit(1)

# CONSTANTS

APP_NAME = "Spark Application Template"

# Main functionality

def main(sc):

    logging.info("string main program: ")
    rdd = sc.parallelize(range(10000), 10)
    print(rdd.mean())

if __name__ == "__main__":
    # Configure OPTIONS
    conf = SparkConf().setAppName(APP_NAME)
    conf = conf.setMaster("yarn")
    sc = SparkContext(conf=conf)
    # set the log-level
    sc.setLogLevel("ERROR")

    # Execute Main functionality
    main(sc)

The SparkContext cannot be created; the script fails with "Java gateway process exited before sending the driver its port number".

Using the pyspark shell, everything works fine!
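
In case it helps with the diagnosis: below is a small sketch (nothing CDH- or job-specific, just standard library calls) that I can run both from the working pyspark shell and from the PyCharm run configuration to compare the two environments. It only prints the Spark-related environment variables, the Spark/py4j entries on sys.path, and whether a java binary is reachable under JAVA_HOME.

#!/usr/bin/env python
# Environment diagnostic: run from the pyspark shell and from PyCharm,
# then diff the output of the two runs.
import os
import sys

# Environment variables the Java gateway start-up depends on
for var in ("SPARK_HOME", "JAVA_HOME", "PYSPARK_SUBMIT_ARGS", "PATH"):
    print("%s = %s" % (var, os.environ.get(var, "<not set>")))

# sys.path entries that point at Spark or py4j
for p in sys.path:
    if "spark" in p.lower() or "py4j" in p.lower():
        print("sys.path entry: %s" % p)

# Check whether a java executable is actually reachable under JAVA_HOME/bin
java_home = os.environ.get("JAVA_HOME", "")
java_exe = os.path.join(java_home, "bin", "java")
print("java under JAVA_HOME/bin exists: %s" % os.path.exists(java_exe))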

Thanks in advance for any suggestions.

Cheers

Gerd
