12-12-2015 12:05 PM
Here's the contents of /etc/spark/conf/spark-env.sh:

##
# Generated by Cloudera Manager and should not be modified directly
##
SELF="$(cd $(dirname $BASH_SOURCE) && pwd)"
if [ -z "$SPARK_CONF_DIR" ]; then
export SPARK_CONF_DIR="$SELF"
fi
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/spark
export DEFAULT_HADOOP_HOME=/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hadoop
### Path of Spark assembly jar in HDFS
export SPARK_JAR_HDFS_PATH=${SPARK_JAR_HDFS_PATH:-''}
export HADOOP_HOME=${HADOOP_HOME:-$DEFAULT_HADOOP_HOME}
if [ -n "$HADOOP_HOME" ]; then
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${HADOOP_HOME}/lib/native
fi
SPARK_EXTRA_LIB_PATH=""
if [ -n "$SPARK_EXTRA_LIB_PATH" ]; then
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SPARK_EXTRA_LIB_PATH
fi
export LD_LIBRARY_PATH
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-$SPARK_CONF_DIR/yarn-conf}
# This is needed to support old CDH versions that use a forked version
# of compute-classpath.sh.
export SCALA_LIBRARY_PATH=${SPARK_HOME}/lib
# Set distribution classpath. This is only used in CDH 5.3 and later.
export SPARK_DIST_CLASSPATH=$(paste -sd: "$SELF/classpath.txt")
export SPARK_LOCAL_DIRS=/dev/shm
#SPARK_DIST_CLASSPATH="$SPARK_DIST_CLASSPATH:/usr/lib/solr/*:/usr/lib/solr/lib/*"
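Since spark-submit sources this file, a quick way to see what it actually hands to a Python job is to print the environment from inside a submitted script. A minimal diagnostic sketch (the file name probe_env.py is an example, not from this thread):

# probe_env.py -- dump what spark-submit hands to the Python process
import os
import sys

# SPARK_HOME and PYTHONPATH as exported by spark-env.sh (may be unset)
print("SPARK_HOME =", os.environ.get("SPARK_HOME"))
print("PYTHONPATH =", os.environ.get("PYTHONPATH"))

# sys.path is what actually decides whether 'from pyspark import ...' works
for p in sys.path:
    print("sys.path entry:", p)

Running it both ways (spark-submit probe_env.py vs. plain python probe_env.py) shows whether the two interpreters see the same paths.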
12-12-2015 11:28 AM
yes, "spark-submit TestPyEnv.py throws an error in a clean env too and thats why i was trying to see if setting the env variables would help...
12-11-2015 10:25 PM
I'm setting the below exports from the shell:

export SPARK_HOME="/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/spark"
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/build:$PYTHONPATH
PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/build:$PYTHONPATH

Running the below code as spark-submit TestPyEnv.py:

import os
import sys
# Path for spark source folder
#os.environ['SPARK_HOME']="/opt/cloudera/parcels/CDH/lib/spark"
# Append pyspark to Python Path
#sys.path.append("/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/spark/python/")

help('modules')
try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    print("Successfully imported Spark Modules")
except ImportError as e:
    print("Can not import Spark Modules", e)
    sys.exit(1)

I'm not able to figure out for the life of me why the SparkContext is not working:

('Can not import Spark Modules', ImportError('cannot import name SparkContext',))