Member since: 02-11-2016
Posts: 10
Kudos Received: 7
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 12373 | 02-25-2016 07:54 PM |
12-05-2016 08:11 PM
Thanks @Artem Ervits, I'll update this once I figure out what we end up doing.
12-05-2016 05:59 PM
@Artem Ervits, that doc makes no reference to HAWQ or gpadmin. When I install, I'm blocked because the Ambari HAWQ installer tries to create a gpadmin user and /home/gpadmin, which is not allowed in our environment.
12-05-2016 04:27 PM
Hi, my company's policy is to have all service accounts follow certain standards, and a user named "gpadmin" does not meet them. Is there a way to use a different system user? I looked at the code and it looks like it could be modified, but that would generally eliminate the option of support.
02-26-2016 03:08 PM
1 Kudo
@Piotr Kuźmiak What I had to do to resolve this was clone the latest Zeppelin from https://github.com/apache/incubator-zeppelin, build it with Maven, then update my zeppelin-env.sh and put the port number I wanted in zeppelin-site.xml. I didn't have to change anything in the Zeppelin GUI. Here is what is set in my zeppelin-env.sh:

export MASTER=yarn-client
export ZEPPELIN_PORT=8090
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.2.0-2950 -Dspark.yarn.queue=default"
export SPARK_HOME=/usr/hdp/current/spark-client/
export HADOOP_CONF_DIR=/etc/hadoop/conf
export PYSPARK_PYTHON=/usr/bin/python
export PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/build:$PYTHONPATH
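For reference, the port setting mentioned above goes into zeppelin-site.xml as a property; a minimal sketch, assuming the stock property name zeppelin.server.port from zeppelin-site.xml.template:

```xml
<!-- zeppelin-site.xml: port for the Zeppelin server built from source above -->
<property>
  <name>zeppelin.server.port</name>
  <value>8090</value>
</property>
```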
02-25-2016 07:54 PM
1 Kudo
There was a bug in Zeppelin; it was fixed by Mina Lee, and the fix was committed a day ago.
02-12-2016 03:43 PM
2 Kudos
@Neeraj Sabharwal I've also tried adding the PYTHONPATH directly in the interpreter configs from the Zeppelin GUI, by creating a variable zeppelin.pyspark.pythonpath. I even tried exporting the PYTHONPATH variable from the Linux CLI. None of these worked. What bothers me is that the PYTHONPATH is not changing, and I'm always getting the same error shown above.
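One way to confirm that the PYTHONPATH really isn't changing is to print it from inside the interpreter itself; a minimal sketch, run from a %pyspark paragraph, using only the standard library:

```python
import os
import sys

# Value the driver-side Python process inherited (None if zeppelin-env.sh was ignored)
print(os.environ.get("PYTHONPATH"))

# Directories the interpreter can actually import from
print(sys.path)
```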
02-12-2016 03:43 PM
1 Kudo
@Neeraj Sabharwal The JIRA issue and tutorial in your comments are completely unrelated to my issue; I had previously found the link to the Apache mail archives. It's about using pyspark on YARN, which I can already do via the CLI. The only problem is with Zeppelin: it ignores the PYTHONPATH in zeppelin-env.sh (which is the same as the one in spark-env.sh).
02-11-2016 07:31 PM
2 Kudos
Hi, I've been trying unsuccessfully to configure the pyspark interpreter in Zeppelin. I can use pyspark from the CLI and can use the Spark interpreter from Zeppelin without issue. Here are the lines which aren't commented out in my zeppelin-env.sh file:

export MASTER=yarn-client
export ZEPPELIN_PORT=8090
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.3.2.0-2950 -Dspark.yarn.queue=default"
export SPARK_HOME=/usr/hdp/current/spark-client/
export HADOOP_CONF_DIR=/etc/hadoop/conf
export PYSPARK_PYTHON=/usr/bin/python
export PYTHONPATH=${SPARK_HOME}/python:${SPARK_HOME}/python/build:$PYTHONPATH

Running a simple pyspark script in the interpreter gives this error:

Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 5, some_yarn_node.networkname): org.apache.spark.SparkException:
Error from python worker:
/usr/bin/python: No module named pyspark
PYTHONPATH was:
/app/hadoop/yarn/local/usercache/my_username/filecache/4121/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar

I've tried adding this line to zeppelin-env.sh, which gives the same error as above:

export PYTHONPATH=/usr/hdp/current/spark-client/python:/usr/hdp/current/spark-client/python/lib/pyspark.zip:/usr/hdp/current/spark-client/python/lib/py4j-0.8.2.1-src.zip

I've tried everything I could find on Google. Any advice for debugging or fixing this problem? Thanks, Ian

Also, in case it's useful for debugging, here are some commands and their outputs:

System.getenv().get("MASTER")
res49: String = yarn-client
System.getenv().get("SPARK_YARN_JAR")
res50: String = null
System.getenv().get("HADOOP_CONF_DIR")
res51: String = /etc/hadoop/conf
System.getenv().get("JAVA_HOME")
res52: String = /usr/jdk64/jdk1.7.0_45
System.getenv().get("SPARK_HOME")
res53: String = /usr/hdp/2.3.2.0-2950/spark
System.getenv().get("PYSPARK_PYTHON")
res54: String = /usr/bin/python
System.getenv().get("PYTHONPATH")
res55: String = /usr/hdp/2.3.2.0-2950/spark/python:/usr/hdp/2.3.2.0-2950/spark/python/build:/usr/hdp/current/spark-client//python/lib/py4j-0.8.2.1-src.zip:/usr/hdp/current/spark-client//python/:/usr/hdp/current/spark-client//python:/usr/hdp/current/spark-client//python/build:/usr/hdp/current/spark-client//python:/usr/hdp/current/spark-client//python/build:
System.getenv().get("ZEPPELIN_JAVA_OPTS")
res56: String = -Dhdp.version=2.3.2.0-2950
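For concreteness, a minimal sketch of the kind of "simple pyspark script" that triggers the failure above, run in a %pyspark paragraph; sc is provided by Zeppelin's Spark interpreter, and any RDD action that spawns Python workers on the YARN executors will surface the missing-module error:

```python
# sc is injected by Zeppelin's Spark interpreter; the action below forces Python
# workers to start on the YARN executors, which is where "No module named pyspark"
# is raised when the workers' PYTHONPATH lacks the pyspark libraries.
rdd = sc.parallelize(range(100))
print(rdd.map(lambda x: x * 2).sum())
```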
Labels:
- Apache Zeppelin