Member since 11-16-2016
11-21-2016 02:32 PM
Thanks, Sandeep. I have already tried that and received:

Loaded plugins: fastestmirror, priorities
Setting up Install Process
Determining fastest mirrors
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=6&arch=x86_64&repo=os&infra=stock error was
14: PYCURL ERROR 6 - "Couldn't resolve host 'mirrorlist.centos.org'"
Error: Cannot find a valid baseurl for repo: base
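PYCURL ERROR 6 indicates the host cannot resolve mirrorlist.centos.org via DNS, so yum never reaches the mirror list at all. One common workaround (a sketch, not a definitive fix — it assumes the machine can reach mirror.centos.org once DNS or /etc/hosts is sorted out) is to bypass the mirrorlist and hardcode a baseurl in the repo file:

# /etc/yum.repos.d/CentOS-Base.repo (illustrative fragment for CentOS 6)
[base]
name=CentOS-$releasever - Base
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6

Fixing the nameserver entries in /etc/resolv.conf addresses the root cause; the baseurl edit only sidesteps the mirrorlist lookup.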
11-16-2016 04:37 PM
I currently have HDP 2.4, which is running Python 2.6.6.

When I run the following in Zeppelin:

%pyspark
import numpy
import scipy
import pandas
import matplotlib

I get "failed to start pyspark"; sometimes I get "numpy not found".

When I run the following in Zeppelin:

System.getenv().get("MASTER")
System.getenv().get("SPARK_YARN_JAR")
System.getenv().get("HADOOP_CONF_DIR")
System.getenv().get("JAVA_HOME")
System.getenv().get("SPARK_HOME")
System.getenv().get("PYSPARK_PYTHON")
System.getenv().get("PYTHONPATH")
System.getenv().get("ZEPPELIN_JAVA_OPTS")
System.getenv().get("ZEPPELIN_PORT")

I get:

res0: String = yarn-client
res1: String = hdfs:///apps/zeppelin/zeppelin-spark-0.5.5-SNAPSHOT.jar
res2: String = /usr/hdp/current/hadoop-client/conf
res3: String = /usr/lib/jvm/java
res4: String = /usr/hdp/2.4.0.0-169/spark
res5: String = null
res6: String = /usr/hdp/current/spark-client//python/lib/py4j-0.9-src.zip:/usr/hdp/current/spark-client//python/:
res7: String = -Dhdp.version=2.4.0.0-169 -Dspark.executor.memory=512m -Dspark.executor.instances=2 -Dspark.yarn.queue=default
res8: String = null

I also see the following directories:

/usr/hdp/2.4.0.0-169/spark/python/pyspark/mllib
/usr/hdp/2.4.0.0-169/spark/python/pyspark/ml

When I click on the interpreter option, I see:

spark %spark (default), %pyspark, %sql, %dep
zeppelin.pyspark.python /usr/hdp/2.4.0.0-169/spark/python/pyspark

These are the contents of the zeppelin-env.sh file:

export MASTER=yarn-client
export SPARK_YARN_JAR=hdfs:///apps/zeppelin/zeppelin-spark-0.5.5-SNAPSHOT.jar
export HADOOP_CONF_DIR=/etc/hadoop/conf
export JAVA_HOME=/usr/lib/jvm/java
export SPARK_HOME=/usr/hdp/current/spark-client/
#export PYSPARK_PYTHON=
export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.4.0.0-169 -Dspark.executor.memory=512m -Dspark.executor.instances=2 -Dspark.yarn.queue=default"
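Two details above stand out: PYSPARK_PYTHON resolves to null (it is commented out in zeppelin-env.sh), and zeppelin.pyspark.python points at a directory (/usr/hdp/2.4.0.0-169/spark/python/pyspark) rather than a Python executable. A useful first diagnostic is to confirm which interpreter %pyspark actually launches and whether that interpreter can see numpy. A minimal sketch (paste into a %pyspark paragraph in Zeppelin, or run with any standalone python to compare):

```python
import sys

# Which Python binary is running, and which version?
# If this differs from the Python that has numpy/scipy/pandas installed,
# that mismatch explains the "numpy not found" symptom.
print(sys.executable)
print(sys.version)

try:
    import numpy
    print("numpy", numpy.__version__, "from", numpy.__file__)
except ImportError:
    # Matches the reported error: the interpreter in use has no
    # numpy on its sys.path.
    print("numpy not importable for", sys.executable)
```

If the interpreter shown is not the one with the scientific packages, setting PYSPARK_PYTHON (and zeppelin.pyspark.python) to the full path of a Python *binary*, not a directory, is the usual remedy.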