Created on 10-12-2017 02:00 AM - edited 08-17-2019 07:45 PM
Hi, I am having trouble running the PySpark interpreter in Zeppelin. I set the zeppelin.pyspark.python variable to /usr/lib/miniconda2/bin/python. I also can't use pandas.
Below is the error in the Zeppelin UI:
Created 10-12-2017 05:18 AM
Do you have pandas installed? If not, try installing it by running:
conda install pandas
Other useful libraries you may want to install are matplotlib and numpy.
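To confirm which Python is actually being used, a small check like the one below may help. This is only a sketch: describe_interpreter is an illustrative helper name (not part of Spark or Zeppelin), and the idea is to run the same snippet once in the pyspark shell and once in a Zeppelin %pyspark paragraph and compare the output.

```python
import sys


def describe_interpreter(module_name="pandas"):
    """Report the running Python executable and whether module_name imports."""
    try:
        module = __import__(module_name)
        # Not every module defines __version__, so fall back to "unknown".
        version = getattr(module, "__version__", "unknown")
    except ImportError:
        version = None
    return {"executable": sys.executable, "module": module_name, "version": version}


info = describe_interpreter()
print("python executable: %s" % info["executable"])
if info["version"] is None:
    print("%s is NOT importable from this interpreter" % info["module"])
else:
    print("%s %s is importable" % (info["module"], info["version"]))
```

If the two runs print different executables, pandas is most likely installed for one Python but not for the one Zeppelin launches.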
Thanks,
Aditya
Created 10-13-2017 03:40 AM
Hi Aditya, I have installed pandas. I can import pandas in the PySpark CLI, but not in Zeppelin.
These errors appear when I run the pyspark command:
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
    self.socket.connect((self.address, self.port))
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
    self.socket.connect((self.address, self.port))
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
Traceback (most recent call last):
  File "/usr/hdp/2.5.3.0-37/spark/python/pyspark/shell.py", line 43, in <module>
    sc = SparkContext(pyFiles=add_files)
  File "/usr/hdp/2.5.3.0-37/spark/python/pyspark/context.py", line 115, in __init__
    conf, jsc, profiler_cls)
  File "/usr/hdp/2.5.3.0-37/spark/python/pyspark/context.py", line 172, in _do_init
    self._jsc = jsc or self._initialize_context(self._conf._jconf)
  File "/usr/hdp/2.5.3.0-37/spark/python/pyspark/context.py", line 235, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1062, in __call__
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 631, in send_command
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 624, in send_command
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 579, in _get_connection
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 585, in _create_connection
  File "/usr/hdp/2.5.3.0-37/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 697, in start
py4j.protocol.Py4JNetworkError: An error occurred while trying to connect to the Java server
>>> import pandas as pd
>>>
Created 10-13-2017 04:37 AM
Try setting the PYSPARK_DRIVER_PYTHON environment variable so that Spark uses Anaconda/Miniconda.
From the logs, it looks like Spark is using the bundled pyspark.
Thanks,
Aditya
Created 10-13-2017 05:39 AM
Should I declare the variable like this?
export PYSPARK_DRIVER_PYTHON=miniconda2
Created 10-13-2017 06:14 AM
Try setting the following instead of PYSPARK_DRIVER_PYTHON:
export PYSPARK_PYTHON=<anaconda python path>
For example: export PYSPARK_PYTHON=/home/ambari/anaconda3/bin/python
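Before restarting anything, it may be worth checking that the variable really points at an interpreter. The snippet below is a rough sketch, not an official Spark tool: check_python_path is a made-up helper name, and it assumes the variable holds a full path to the python binary (a bare name such as miniconda2 is reported as "not a file" here, since it would only work if it resolved as a command on PATH).

```python
import os


def check_python_path(path):
    """Return a short status string for a candidate interpreter path."""
    if not path:
        return "not set"
    if not os.path.isfile(path):
        return "not a file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Report on both variables mentioned in this thread.
for name in ("PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"):
    value = os.environ.get(name)
    print("%s=%r -> %s" % (name, value, check_python_path(value)))
```

Run it in the same shell session where the export was done, so it sees the same environment that pyspark will.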
Created 10-13-2017 04:39 AM
Try setting the PYSPARK_DRIVER_PYTHON environment variable so that Spark uses Anaconda/Miniconda.
From the logs, it looks like Spark is using the bundled pyspark. Check the link for more info:
https://spark.apache.org/docs/1.6.2/programming-guide.html#linking-with-spark
Thanks,
Aditya
Created 10-13-2017 06:15 AM
Try setting the following instead of PYSPARK_DRIVER_PYTHON:
export PYSPARK_PYTHON=<anaconda python path>
For example: export PYSPARK_PYTHON=/home/ambari/anaconda3/bin/python
Created on 10-13-2017 06:59 AM - edited 08-17-2019 07:45 PM
Aditya, I got this error in the Zeppelin UI: