
PyCharm error: Java gateway process exited before sending the driver its port number


Dear community,

I'm getting an error in PyCharm (CDH 5.8.0, Spark 1.6.2) with the following code, which works on my Mac in standalone mode:

#!/usr/bin/env python

# Imports
import sys
import os
import logging

# Path for spark source folder
os.environ['SPARK_HOME'] = "/opt/cloudera/parcels/CDH/lib/spark"
os.environ['JAVA_HOME'] = "/opt/jdk1.8.0_101/bin/"
os.environ['PYSPARK_SUBMIT_ARGS'] = "--master yarn pyspark-shell"

# Append PySpark to the Python path
sys.path.append("/opt/cloudera/parcels/CDH/lib/spark/python/")
sys.path.append("/opt/jdk1.8.0_101/bin/")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    from pyspark.sql import HiveContext

    logging.info("Successfully imported Spark Modules")

except ImportError as e:
    logging.error("Cannot import Spark modules: %s", e)
    sys.exit(1)

# CONSTANTS

APP_NAME = "Spark Application Template"

# Main functionality

def main(sc):

    logging.info("starting main program")
    rdd = sc.parallelize(range(10000), 10)
    print(rdd.mean())

if __name__ == "__main__":
    # Configure OPTIONS
    conf = SparkConf().setAppName(APP_NAME)
    conf = conf.setMaster("yarn")
    sc = SparkContext(conf=conf)
    # set the log-level
    sc.setLogLevel("ERROR")

    # Execute Main functionality
    main(sc)

The Spark context cannot be created.
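For context, the "Java gateway" is the JVM that PySpark launches before the driver can connect, and Spark's launcher scripts resolve the JVM binary as ${JAVA_HOME}/bin/java when JAVA_HOME is set. A minimal sketch (using the JAVA_HOME value from the script above) shows why pointing JAVA_HOME at the JDK's bin/ directory, rather than the JDK root, makes that lookup fail:

```python
import os

# JAVA_HOME as set in the script above
java_home = "/opt/jdk1.8.0_101/bin/"

# Spark's launcher resolves the JVM as ${JAVA_HOME}/bin/java, so JAVA_HOME
# should point at the JDK root, not at its bin/ directory.
java_exe = os.path.join(java_home, "bin", "java")
print(java_exe)                  # /opt/jdk1.8.0_101/bin/bin/java -- note the doubled "bin"
print(os.path.isfile(java_exe))  # False on a standard JDK layout
```

If the resolved path does not exist, the gateway process can never start, which matches the "exited before sending the driver its port number" symptom.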

From the pyspark shell, however, everything works fine.

 

Thanks in advance for any suggestions.

Cheers

Gerd
