
Livy "Interpreter died" with PySpark





On top of my Ubuntu-based Docker image (Python 3.6, Spark 2.2.2), I installed Livy and am trying to set up a Spark server in local mode (to start with).


During the build, I specified my Spark version as follows:

export SPARK_VERSION=2.2.2

mvn -DskipTests -Dspark.version=$SPARK_VERSION clean package


The "Spark Example" works perfectly.

However, the "PySpark Example" does not work and fails with the message "Interpreter died" (please see below).


I tried different settings in the configuration files under the /usr/local/livy/conf folder, without success.


Can anyone tell me what the cause might be and how to debug it? (The Spark log does not show anything...)
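Besides the Spark log, Livy keeps its own per-session log, which its REST API exposes at GET /sessions/{id}/log with `from` and `size` query parameters. The following is only a minimal sketch of reading it, assuming the server address http://localhost:8998 and session id 0 used in the client; the helper names `session_log_url` and `fetch_session_log` are hypothetical, and it uses the standard library's urllib instead of requests so that it has no dependencies.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def session_log_url(host, session_id, start=0, size=100):
    """Build the URL for Livy's GET /sessions/{id}/log endpoint (hypothetical helper)."""
    query = urlencode({'from': start, 'size': size})
    return '%s/sessions/%d/log?%s' % (host.rstrip('/'), session_id, query)


def fetch_session_log(host, session_id, start=0, size=100):
    """Return the log lines Livy holds for the session (needs a running Livy server)."""
    with urlopen(session_log_url(host, session_id, start, size)) as resp:
        body = json.loads(resp.read().decode('utf-8'))
    # The response body carries the lines under the 'log' key.
    return body.get('log', [])


# Usage, assuming the server and session id from the client:
#   for line in fetch_session_log('http://localhost:8998', 0):
#       print(line)
```

If the session log is empty as well, the output of the Livy server process itself may be the next place to look.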


I'd very much appreciate your help.



Here is a snippet of the client:


import json, pprint, requests, textwrap
host = 'http://localhost:8998'
data = {'kind': 'pyspark'}
headers = {'Content-Type': 'application/json'}
r = requests.post(host + '/sessions', data=json.dumps(data), headers=headers)
# r.json() returns:
# {'id': 0, 'appId': None, 'owner': None, 'proxyUser': None, 'state': 'starting', 'kind': 'pyspark', 'appInfo': {'driverLogUrl': None, 'sparkUiUrl': None}, 'log': []}
session_url = host + r.headers['location']
r2 = requests.get(session_url, headers=headers)
# r2.json() returns:
# {'id': 0,
#  'appId': None,
#  'owner': None,
#  'proxyUser': None,
#  'state': 'idle',
#  'kind': 'pyspark',
#  'appInfo': {'driverLogUrl': None, 'sparkUiUrl': None},
#  'log': []}
statements_url = session_url + '/statements'

data = {
  'code': textwrap.dedent("""
    import random
    NUM_SAMPLES = 100000
    def sample(p):
      x, y = random.random(), random.random()
      return 1 if x*x + y*y < 1 else 0

    count = sc.parallelize(xrange(0, NUM_SAMPLES)).map(sample).reduce(lambda a, b: a + b)
    print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
    """)
}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
# r.json() returns:
# {'id': 0, 'output': None, 'progress': 0.0, 'state': 'waiting'}
r = requests.get(statements_url, headers=headers)
# r.json() returns:
# {'statements': [{'id': 0, 'output': {'ename': 'Error', 'evalue': 'Interpreter died:\n', 'execution_count': 0, 'status': 'error', 'traceback': []}, 'progress': 1.0, 'state': 'available'}], 'total_statements': 1}
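To make the failure easier to read, the statements response can be reduced to one line per statement. This is only a sketch: `summarize_statements` is a hypothetical helper, not part of Livy or requests, and it relies on nothing beyond the fields visible in the response above.

```python
def summarize_statements(response):
    """Return one (id, status, message) tuple per statement in a
    GET /sessions/{id}/statements response body (hypothetical helper)."""
    summary = []
    for stmt in response.get('statements', []):
        # 'output' is None while the statement is still waiting or running.
        output = stmt.get('output') or {}
        # Fall back to the statement state when there is no output yet.
        status = output.get('status', stmt.get('state'))
        if status == 'error':
            message = '%s: %s' % (output.get('ename'),
                                  (output.get('evalue') or '').strip())
        else:
            message = ''
        summary.append((stmt.get('id'), status, message))
    return summary


# The response body copied from the failing run above.
response = {'statements': [{'id': 0,
                            'output': {'ename': 'Error',
                                       'evalue': 'Interpreter died:\n',
                                       'execution_count': 0,
                                       'status': 'error',
                                       'traceback': []},
                            'progress': 1.0,
                            'state': 'available'}],
            'total_statements': 1}

print(summarize_statements(response))
# prints [(0, 'error', 'Error: Interpreter died:')]
```

The empty 'traceback' and the empty text after "Interpreter died:" show that the response carries no further detail, so the cause has to be found in the logs.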